AI Foundations: Lesson 04 of 10
Tokens, Context & Why AI Forgets
A model reads and writes in chunks called tokens, and it can hold only a limited number at once. When that limit is reached, the oldest tokens fall away. That, not the tokens themselves, is why AI "forgets."
The one mental model
Picture a fixed-size working desk. Everything on it, the model can see. As the conversation grows, the desk fills, and the oldest notes slide off the back. The model isn't ignoring you; that text is simply no longer on the desk.
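The desk can be sketched in a few lines of Python. This is a toy illustration, not how any particular product manages its window: it assumes one whitespace-separated word equals one token, and DESK_SIZE and say are made-up names chosen for this example.

```python
from collections import deque

# Toy assumption: one whitespace-separated word = one token.
# Real tokenizers split text differently.
DESK_SIZE = 8  # hypothetical window size, kept tiny on purpose

# A deque with maxlen drops its oldest item when a new one is appended to a full deque.
desk = deque(maxlen=DESK_SIZE)

def say(text):
    """Put each toy token on the desk and return what is still visible."""
    for token in text.split():
        desk.append(token)
    return list(desk)

say("my name is Ada")
visible = say("please summarise this long report for me today")
print(visible)
```

The second message alone fills the desk, so "Ada" has slid off the back. Nothing was ignored; the name is just no longer among the tokens the model can see.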
Key terms
Token
A chunk of text — sometimes a whole word, sometimes a piece of one, sometimes just a comma. Both your input and the model's output are counted in tokens.
Tokenization
How text is split into tokens. There's no universal rule: each model family uses its own tokenizer, so the same sentence can split differently from one model to the next.
Context window
The maximum number of tokens a model can hold at once: the "desk." Its size is fixed for a given model.
Hidden tokens
The system prompt and any uploaded file take up the window too — before you type a word. A big file crowds out the conversation.
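The key terms above can be tied together in a short sketch. Everything here is an assumption made for illustration: word_tokens and chunk_tokens are two invented toy tokenizers (real models use learned vocabularies), and CONTEXT_WINDOW is a deliberately tiny, hypothetical limit.

```python
import re

def word_tokens(text):
    """Toy tokenizer 1: each word and each punctuation mark is a token."""
    return re.findall(r"\w+|[^\w\s]", text)

def chunk_tokens(text, size=3):
    """Toy tokenizer 2: fixed-size character chunks, spaces included."""
    return [text[i:i + size] for i in range(0, len(text), size)]

sentence = "Tokenization is tricky, isn't it?"
print(word_tokens(sentence))   # same sentence, one way of splitting it
print(chunk_tokens(sentence))  # same sentence, a different split and count

CONTEXT_WINDOW = 50  # hypothetical limit in tokens

def remaining_budget(system_prompt, uploaded_file, tokenizer=word_tokens):
    """Tokens left for the conversation after the hidden tokens are counted."""
    used = len(tokenizer(system_prompt)) + len(tokenizer(uploaded_file))
    return CONTEXT_WINDOW - used

system_prompt = "You are a helpful assistant."
uploaded_file = "quarterly report " * 20  # stand-in for a large file
left = remaining_budget(system_prompt, uploaded_file)
print(left)
```

With these toy numbers, the system prompt and the file use 46 of the 50 tokens before the user types anything, which is the crowding-out effect described under "Hidden tokens."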
The misconception to drop
✕ "It's ignoring me / tokens make it forget / it remembers me between chats."
✓ The context window is a fixed token budget. When the conversation exceeds it, the oldest tokens are pushed out, and that is the forgetting. Each new chat starts with an empty window, and nothing carries over unless a memory feature saves it.
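The corrected picture can be modelled with a toy chat session. This is a sketch under stated assumptions, not a description of any real assistant: the Chat class, its budget parameter, and the saved_memory argument are all hypothetical, and one word is again counted as one token.

```python
class Chat:
    """Toy chat session. One word = one token (an assumption for illustration)."""

    def __init__(self, budget, saved_memory=None):
        self.budget = budget
        # A new chat starts empty, apart from notes a memory feature saved.
        self.window = list(saved_memory or [])

    def _tokens(self):
        return sum(len(message.split()) for message in self.window)

    def send(self, message):
        self.window.append(message)
        # Push out the oldest messages until the total fits the budget.
        while self._tokens() > self.budget and len(self.window) > 1:
            self.window.pop(0)

    def knows(self, word):
        return any(word in message.split() for message in self.window)

first = Chat(budget=10)
first.send("my dog is called Biscuit")
first.send("tell me a story about dragons and castles please")
print(first.knows("Biscuit"))   # pushed out when the budget was exceeded

second = Chat(budget=10)
print(second.knows("Biscuit"))  # a new chat starts with an empty window

third = Chat(budget=10, saved_memory=["dog is called Biscuit"])
print(third.knows("Biscuit"))   # carried over only because it was saved
```

The three sessions mirror the three claims in the misconception: forgetting inside a long chat, no recall across chats, and recall only when something was explicitly saved.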