AI Glossary

What is a Token?

A token is the basic unit of text that AI language models process. Rather than reading individual characters or whole words, LLMs break text into tokens, which can be whole words, parts of words, or punctuation marks. For English, one token is roughly 3/4 of a word, so 100 tokens correspond to approximately 75 words. Tokenization is performed by subword algorithms such as BPE (Byte Pair Encoding), often through tokenizer libraries such as SentencePiece. Token counts matter because they determine API costs (priced per input and output token), whether a request fits within a model's context window, and processing speed. Different models use different tokenizers, so the same text may produce different token counts across models.
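
To make the idea of subword tokens concrete, here is a minimal toy sketch of BPE-style merging. It is an illustration under simplifying assumptions, not how any production tokenizer is implemented: it splits on whitespace, works on characters rather than bytes, and the function names (`train_bpe`, `tokenize`) are hypothetical.

```python
from collections import Counter


def most_frequent_pair(words):
    """Return the most frequent adjacent symbol pair, weighted by word count."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get) if pairs else None


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def train_bpe(corpus, num_merges):
    """Learn an ordered list of merges from a whitespace-separated corpus."""
    words = dict(Counter(tuple(w) for w in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        words = merge_pair(words, pair)
        merges.append(pair)
    return merges


def tokenize(word, merges):
    """Split a word into characters, then apply the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = next(iter(merge_pair({symbols: 1}, pair)))
    return list(symbols)


merges = train_bpe("low low low lower lowest", num_merges=2)
print(tokenize("lowest", merges))
```

With only two merges learned from this tiny corpus, the frequent fragment "low" becomes one token while the rarer suffix stays split into characters, which is why a token can be a word or only part of one.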

Frequently Asked Questions

What is a token in AI?

A token is the basic unit of text an AI model processes. It can be a word, part of a word, or punctuation. One token is roughly 3/4 of an English word.
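
The 3/4 rule of thumb can be turned into a quick estimate. The sketch below is only an approximation for English text; the helper names are hypothetical, and an exact count requires running the specific model's tokenizer.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb for English text; varies by tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count from a whitespace word count."""
    word_count = len(text.split())
    return round(word_count / WORDS_PER_TOKEN)


def estimate_words(token_count: int) -> int:
    """Estimate how many English words a token budget covers."""
    return round(token_count * WORDS_PER_TOKEN)


print(estimate_words(100))             # 100 tokens is about 75 words
print(estimate_tokens("word " * 75))   # 75 words is about 100 tokens
```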

Why do tokens matter?

Tokens determine API pricing, context window limits, and processing speed. Understanding token counts helps optimize cost and performance when using AI models.
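
As a rough illustration of how token counts feed into cost and context limits, the sketch below computes a request's price and checks whether it fits a context window. The prices and window size are made-up placeholders, not any provider's real figures, and it assumes input and output tokens share one context window.

```python
from dataclasses import dataclass


@dataclass
class ModelLimits:
    """Hypothetical model parameters; real values come from the provider."""
    context_window: int            # max tokens per request (input + output)
    usd_per_million_input: float   # price per 1,000,000 input tokens
    usd_per_million_output: float  # price per 1,000,000 output tokens


def request_cost(limits: ModelLimits, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD, with input and output tokens priced separately."""
    total = (input_tokens * limits.usd_per_million_input
             + output_tokens * limits.usd_per_million_output)
    return total / 1_000_000


def fits_context(limits: ModelLimits, input_tokens: int, max_output_tokens: int) -> bool:
    """True if the prompt plus the reserved output budget fits the window."""
    return input_tokens + max_output_tokens <= limits.context_window


# Placeholder numbers chosen only to keep the arithmetic easy to follow.
example = ModelLimits(context_window=8000,
                      usd_per_million_input=2.0,
                      usd_per_million_output=8.0)
print(request_cost(example, input_tokens=1000, output_tokens=500))
print(fits_context(example, input_tokens=7500, max_output_tokens=1000))
```

Output tokens are priced separately here because the definition above notes that APIs charge per input and output token; the two rates often differ.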