## Full breakdown by model
| Model | Input tokens | Context % | Input $ | Output $ | Total |
|---|---|---|---|---|---|
| GPT-4o / GPT-4o-mini | 29 | 0.0% | $0.00 | $0.01 | $0.01 |
| GPT-4 Turbo | 29 | 0.0% | $0.00 | $0.02 | $0.02 |
| GPT-3.5 Turbo | 27 | 0.2% | $0.00 | $0.00 | $0.00 |
| Claude Opus 4 | 30 | 0.0% | $0.00 | $0.04 | $0.04 |
| Claude Sonnet 4 | 30 | 0.0% | $0.00 | $0.01 | $0.01 |
| Claude Haiku 4 | 30 | 0.0% | $0.00 | $0.00 | $0.00 |
| Gemini 2.5 Pro | 28 | 0.0% | $0.00 | $0.00 | $0.00 |
| Gemini 2.5 Flash | 28 | 0.0% | $0.00 | $0.00 | $0.00 |
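The totals above are simple linear arithmetic: each side of the exchange is billed per million tokens. A minimal sketch of that calculation (the rates in the example are illustrative, not the exact prices behind the table):

```python
def cost_usd(input_tokens: int, output_tokens: int,
             input_price_per_m: float, output_price_per_m: float) -> float:
    """Linear cost model: input and output tokens are each billed per million."""
    return (input_tokens / 1_000_000 * input_price_per_m
            + output_tokens / 1_000_000 * output_price_per_m)

# 29 input tokens + 500 output tokens at assumed rates of $2.50/1M in, $10/1M out:
print(round(cost_usd(29, 500, 2.50, 10.00), 4))  # → 0.0051
```

With tiny inputs like the 27-30 tokens in the table, output tokens dominate the bill, which is why several rows round to $0.00 on the input side.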
## How token counting actually works
Tokens are the sub-word units each model's tokenizer produces. English prose averages roughly 3.6–4.0 characters per token, but this varies by content: code needs more tokens per character (symbols and whitespace tokenize less efficiently), while Chinese and Japanese pack fewer characters into each token (multi-byte scripts).
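The chars-per-token heuristic above can be sketched as a lookup of content-type divisors. The specific divisors here are assumptions for illustration, not measured tokenizer statistics:

```python
# Illustrative chars-per-token divisors (assumed values, not measured).
CHARS_PER_TOKEN = {
    "english": 4.0,  # prose: ~3.6-4.0 chars/token
    "code": 3.0,     # symbols and whitespace tokenize less efficiently
    "cjk": 1.5,      # multi-byte scripts carry fewer chars per token
}

def estimate_tokens(text: str, kind: str = "english") -> int:
    """Rough token estimate from character count alone."""
    return max(1, round(len(text) / CHARS_PER_TOKEN[kind]))

print(estimate_tokens("The quick brown fox jumps over the lazy dog."))  # → 11
```

Any real tool would tune these divisors against an exact tokenizer for its target models.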
### Estimation vs exact counting
This tool uses a character-based estimate, typically within ±5% of exact tokenizer counts, which is fast enough to run in the browser on every keystroke. For byte-accurate counts, install tiktoken (OpenAI), @anthropic-ai/tokenizer (Claude), or Google's Gemini tokenizer.
## Context window: what "2M tokens" actually costs
Gemini 2.5 Pro's 2M-token context window sounds free, but you pay for every token you put in. Cramming a 1M-token book into the prompt costs $1.25 per API call at $1.25/1M input, and that charge recurs every time the conversation resends the context. Use context deliberately.
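The real expense of a huge context is that input tokens are billed again on every call that includes them. A quick sketch of how that compounds over a conversation (the $1.25/1M rate is an assumed example price):

```python
def context_cost(input_tokens: int, price_per_m_usd: float, calls: int = 1) -> float:
    """Input tokens are billed again on every call that resends them."""
    return input_tokens / 1_000_000 * price_per_m_usd * calls

# A 1M-token context at an assumed $1.25/1M input rate, resent across 50 turns:
print(round(context_cost(1_000_000, 1.25, calls=50), 2))  # → 62.5
```

One call is cheap; fifty turns of a conversation that each resend the full context are not, which is the practical argument for trimming context rather than filling the window.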