How Large Language Models Work
Peek under the hood of ChatGPT and Claude — no PhD required.
You don't need to be a data scientist to understand LLMs. Here's the simplified version — and once you get this, everything else in this course makes more sense.
Watch: 3Blue1Brown — Large Language Models explained briefly (excellent visual walkthrough)
Training: Reading the Internet
An LLM is trained by reading billions of pages of text — books, websites, articles, code, conversations. During training, it learns patterns: what words tend to follow other words, how sentences are structured, what facts are commonly stated.
It doesn't memorize pages verbatim. Instead, it builds a statistical model of language — a compressed understanding of how humans communicate.
Think of training like this: imagine reading every book, every Wikipedia article, every forum post ever written. You wouldn't memorize them word-for-word, but you'd develop an incredibly rich sense of how language works, what topics relate to each other, and what good writing looks like. That's what the AI has — but at superhuman scale.
The Transformer Architecture
The key innovation is called attention. When processing your prompt, the model pays "attention" to which words relate to other words.
In the sentence "The cat sat on the mat because it was tired," the model learns that "it" refers to "the cat," not "the mat."
This seems obvious to you, but for a computer to figure this out automatically — across millions of sentences — was a massive breakthrough.
This ability to track relationships across long passages is what makes modern AI so capable. Before Transformers (2017), AI could barely handle a single paragraph. Now it can reason across 100,000+ words.
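To make "attention" concrete, here is a toy sketch of its core idea: score one word's vector against the others, then normalize the scores into weights. The 2-D "meaning" vectors below are made up for illustration — real models use vectors with thousands of dimensions, learned during training — but the mechanism is the same: similar vectors get high weight.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Score the query vector against every key vector (dot product),
    then normalize. Similar vectors end up with high weight."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)

# Hypothetical 2-D "meaning" vectors for three words (invented numbers).
vectors = {"cat": [1.0, 0.2], "mat": [0.1, 1.0], "it": [0.9, 0.3]}

weights = attention_weights(vectors["it"], [vectors["cat"], vectors["mat"]])
for word, w in zip(["cat", "mat"], weights):
    print(f'"it" attends to "{word}" with weight {w:.2f}')
```

Because the made-up vector for "it" points in roughly the same direction as the one for "cat", the attention weight on "cat" comes out higher — which is exactly how the model resolves the pronoun in the example sentence.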
Watch: 3Blue1Brown — Transformers, the tech behind LLMs (deep dive into attention)
How It Generates Responses
When you send a message, the AI follows these steps:
1. Tokenizes your input (breaks it into pieces called tokens)
2. Processes the tokens through dozens of layers of the neural network
3. Predicts the most likely next token
4. Repeats — generating one token at a time until the response is complete
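The loop in steps 3 and 4 can be sketched in a few lines. The lookup table below is a stand-in "model" with invented probabilities — a real LLM computes these scores with a neural network over the whole conversation — but the generate-one-token-then-repeat structure is the same.

```python
# A made-up "model": for each token, the likely next tokens with scores.
# A real LLM computes these with a neural network; this table is a
# stand-in so the generation loop itself is visible.
NEXT_TOKEN_SCORES = {
    "<start>": {"Paris": 0.9, "Lyon": 0.1},
    "Paris": {"is": 0.95, "was": 0.05},
    "is": {"the": 0.9, "a": 0.1},
    "the": {"capital": 0.8, "city": 0.2},
    "capital": {"<end>": 1.0},
}

def generate(prompt_token, max_tokens=10):
    """Repeatedly pick the most likely next token until <end>."""
    output = []
    token = prompt_token
    for _ in range(max_tokens):
        choices = NEXT_TOKEN_SCORES.get(token, {"<end>": 1.0})
        token = max(choices, key=choices.get)  # greedy: take the top token
        if token == "<end>":
            break
        output.append(token)
    return " ".join(output)

print(generate("<start>"))  # → Paris is the capital
```

Note that each step only ever produces one token; the fluent sentence you see is just this loop running many times, very fast.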
Each token generation considers the entire conversation so far. That's why AI can stay on topic across long chats.
The AI generates its response one token at a time, from left to right. When you see it "typing" in ChatGPT, that's not a display trick — it's literally deciding each token as it goes. This is why the beginning of a response influences the rest: if the AI starts down a wrong path, the rest of the answer tends to follow it. You can fix this by saying "Wait, let's reconsider," which prompts it to start fresh.
Important Mental Model
The AI doesn't "know" things the way you do. It has learned statistical associations. When it says "Paris is the capital of France," it's not recalling a fact from memory — it's producing the tokens that most naturally follow the pattern of the conversation.
This is why AI can be confidently wrong (called hallucination). The statistically likely response isn't always the correct one. The AI has no way to flag "I'm not sure about this" — it just produces whatever tokens are most probable. Always verify facts, especially numbers, dates, and citations.
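A toy illustration of why this happens: the model always emits the top-scoring token, whether that top score is overwhelming or barely ahead of the alternatives. The probabilities below are invented for the sketch, but the asymmetry is the point — the output looks equally fluent in both cases.

```python
def most_likely(token_probs):
    """The model emits the top-scoring token, even when that top score
    is low, i.e. even when it is effectively unsure."""
    best = max(token_probs, key=token_probs.get)
    return best, token_probs[best]

# Made-up next-token probabilities for a fact the model "knows" well:
confident = {"Paris": 0.92, "Lyon": 0.05, "Nice": 0.03}
# ...and for an obscure question where no answer dominates:
unsure = {"1943": 0.24, "1947": 0.22, "1951": 0.21, "1938": 0.19}

print(most_likely(confident))  # ('Paris', 0.92) — probably correct
print(most_likely(unsure))     # ('1943', 0.24) — stated just as fluently
```

In the second case the "winning" answer has barely a 1-in-4 score, yet the reader sees only the confident-sounding output — which is why numbers, dates, and citations deserve extra scrutiny.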
Parameters and Model Size
| Model | Parameters | Context Window | Released |
|---|---|---|---|
| GPT-3.5 | 175 billion | 16K tokens | 2022 |
| GPT-4 | ~1.7 trillion | 128K tokens | 2023 |
| Claude 3.5 | Undisclosed | 200K tokens | 2024 |
| Gemini 1.5 | Undisclosed | 1M tokens | 2024 |
Parameters are the "knobs" the model adjusts during training. More parameters generally means more capability, but also more cost to run.
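To get a feel for how parameter counts explode, here is some back-of-the-envelope arithmetic. A single dense layer has one weight per input-output pair plus one bias per output; the layer sizes below are illustrative (12,288 is the width reported for GPT-3, but the total is a rough sketch, not an exact count of any real model).

```python
def linear_layer_params(inputs, outputs):
    """A dense layer has one weight per input-output pair, plus one
    bias per output. Each of those numbers is a 'parameter' (a knob)."""
    return inputs * outputs + outputs

# A modest layer, 1,000 inputs -> 1,000 outputs, is already ~1M knobs.
print(linear_layer_params(1_000, 1_000))  # 1001000

# Stack ~100 layers at GPT-3's reported width and you reach the
# billions (real models have several such matrices per layer, so
# actual totals are even higher).
print(96 * linear_layer_params(12_288, 12_288))
```

The takeaway: billions of parameters aren't exotic — they fall straight out of multiplying layer widths by layer counts.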
You don't need to memorize these numbers. The key insight: models are getting bigger, faster, and cheaper every 6-12 months. Whatever limitations you hit today will likely be gone within a year. Focus on learning the skills of working with AI — those transfer across every model upgrade.
Exercises
1. What is a "token" in the context of LLMs?
2. Why do AI models sometimes "hallucinate" (state incorrect things confidently)?
3. Ask an AI to explain how it generates text, then compare its answer to what you learned in this lesson. Did it get anything wrong or oversimplified?
Hint: Pay attention to whether it claims to "understand" or "think" — those are simplifications.