LLM Token Counter and Cost Calculator
Count tokens and estimate API costs for GPT-4o, Claude, Gemini, and other large language models. Tokenization runs locally in your browser using tiktoken — your text stays private.
How to Use
- Select a model from the dropdown (grouped by provider)
- Paste or type text into the input area
- View token count and cost estimates in the stats panel
Understanding Tokens
Language models do not process text character by character. Instead, they use a tokenizer to split text into tokens — chunks of varying length that the model treats as individual units. Common English words are often a single token, while uncommon words or non-English text may be split into multiple tokens.
These tokenizers are typically built with Byte Pair Encoding (BPE), which constructs a vocabulary from training data by repeatedly merging the most frequent pair of adjacent symbols into a single new symbol. As a result, frequently occurring words or subwords end up with their own token, while rare text must be represented with more, smaller tokens.
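The merge loop at the heart of BPE training can be sketched in a few lines of plain Python. The toy corpus below (word frequencies, words split into characters) is invented for illustration:

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words (word -> frequency dict)."""
    pairs = Counter()
    for word, freq in words.items():
        for i in range(len(word) - 1):
            pairs[word[i], word[i + 1]] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for word, freq in words.items():
        out, i = [], 0
        while i < len(word):
            if i < len(word) - 1 and (word[i], word[i + 1]) == pair:
                out.append(word[i] + word[i + 1])  # fuse the pair into one symbol
                i += 2
            else:
                out.append(word[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: each word is a tuple of single-character symbols.
corpus = {tuple("lower"): 5, tuple("lowest"): 2, tuple("newer"): 6}
for _ in range(3):
    corpus = merge_pair(corpus, most_frequent_pair(corpus))
```

After three merges, `"lower"` has collapsed to the two symbols `("lo", "wer")`: the frequent pieces earned their own entries, exactly the behavior described above.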
Pricing Notes
Prices shown are standard API rates as of early 2026. Actual costs may vary based on batch pricing, prompt caching discounts, or enterprise agreements. Most providers offer significant discounts for cached prompts (up to 90% off input costs) and batch API usage (50% off).
Input and output tokens are priced differently because generating output requires more computation than processing input. Output tokens are typically 2-5x more expensive than input tokens.
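The arithmetic behind the estimate is straightforward: each token class is billed at its own per-million rate. A sketch with placeholder figures (the model name and dollar rates below are made up, not real prices; check your provider's pricing page):

```python
# Placeholder per-million-token rates in USD -- illustrative only.
# Note the output rate is 4x the input rate, in the typical 2-5x range.
PRICES = {
    "example-model": {"input": 2.50, "output": 10.00},
}

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate USD cost: tokens * per-million rate, summed per class."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# 10k input + 2k output tokens at the placeholder rates above:
cost = estimate_cost("example-model", 10_000, 2_000)  # 0.045 USD
```

Cached-prompt and batch discounts would apply as multipliers on the relevant rate before summing.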