## Full breakdown by model
| Model | Input tokens | Context % | Input $ | Output $ | Total |
|---|---|---|---|---|---|
| GPT-4o / GPT-4o-mini | 29 | 0.0% | $0.00 | $0.01 | $0.01 |
| GPT-4 Turbo | 29 | 0.0% | $0.00 | $0.02 | $0.02 |
| GPT-3.5 Turbo | 27 | 0.2% | $0.00 | $0.00 | $0.00 |
| Claude Opus 4 | 30 | 0.0% | $0.00 | $0.04 | $0.04 |
| Claude Sonnet 4 | 30 | 0.0% | $0.00 | $0.01 | $0.01 |
| Claude Haiku 4 | 30 | 0.0% | $0.00 | $0.00 | $0.00 |
| Gemini 2.5 Pro | 28 | 0.0% | $0.00 | $0.00 | $0.00 |
| Gemini 2.5 Flash | 28 | 0.0% | $0.00 | $0.00 | $0.00 |
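The totals above are simple linear arithmetic: each side of the exchange is billed per million tokens. A minimal sketch of that calculation (the rates in the example are illustrative, not the exact prices behind the table):

```python
def cost_usd(input_tokens: int, output_tokens: int,
             input_price_per_m: float, output_price_per_m: float) -> float:
    """Linear cost model: input and output tokens are each billed per million."""
    return (input_tokens / 1_000_000 * input_price_per_m
            + output_tokens / 1_000_000 * output_price_per_m)

# 29 input tokens + 500 output tokens at assumed rates of $2.50/1M in, $10/1M out:
print(round(cost_usd(29, 500, 2.50, 10.00), 4))  # → 0.0051
```

With tiny inputs like the 27-30 tokens in the table, output tokens dominate the bill, which is why several rows round to $0.00 on the input side.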
## How token counting actually works
Tokens are the sub-word units each model's tokenizer produces. English prose averages roughly 3.6–4.0 characters per token, but this varies by content: code needs more tokens per character (symbols and whitespace tokenize less efficiently), while Chinese and Japanese pack fewer characters into each token (multi-byte scripts).
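The chars-per-token heuristic above can be sketched as a lookup of content-type divisors. The specific divisors here are assumptions for illustration, not measured tokenizer statistics:

```python
# Illustrative chars-per-token divisors (assumed values, not measured).
CHARS_PER_TOKEN = {
    "english": 4.0,  # prose: ~3.6-4.0 chars/token
    "code": 3.0,     # symbols and whitespace tokenize less efficiently
    "cjk": 1.5,      # multi-byte scripts carry fewer chars per token
}

def estimate_tokens(text: str, kind: str = "english") -> int:
    """Rough token estimate from character count alone."""
    return max(1, round(len(text) / CHARS_PER_TOKEN[kind]))

print(estimate_tokens("The quick brown fox jumps over the lazy dog."))  # → 11
```

Any real tool would tune these divisors against an exact tokenizer for its target models.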
### Estimation vs exact counting
This tool uses a character-based estimate, typically within ±5% of exact tokenizer counts, which is fast enough to run in the browser on every keystroke. For byte-accurate counts, install tiktoken (OpenAI), @anthropic-ai/tokenizer (Claude), or Google's Gemini tokenizer.
## Context window: what "2M tokens" actually costs
Gemini 2.5 Pro's 2M-token context window sounds free, but you pay for every token you put in. Cramming a 1M-token book into the prompt costs $1.25 per API call at $1.25/1M input, and that charge recurs every time the conversation resends the context. Use context deliberately.
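The real expense of a huge context is that input tokens are billed again on every call that includes them. A quick sketch of how that compounds over a conversation (the $1.25/1M rate is an assumed example price):

```python
def context_cost(input_tokens: int, price_per_m_usd: float, calls: int = 1) -> float:
    """Input tokens are billed again on every call that resends them."""
    return input_tokens / 1_000_000 * price_per_m_usd * calls

# A 1M-token context at an assumed $1.25/1M input rate, resent across 50 turns:
print(round(context_cost(1_000_000, 1.25, calls=50), 2))  # → 62.5
```

One call is cheap; fifty turns of a conversation that each resend the full context are not, which is the practical argument for trimming context rather than filling the window.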