Lesson 9 of 13

Why AI makes things up

Explain why a model can be fluent, confident, and wrong.

01 · Learn · the idea

Ask a text model what water is made of and it answers cleanly: two hydrogen atoms and one oxygen atom. Ask it to summarise a book that does not exist, and it does that just as cleanly: a plot, a genre, maybe a publication year, all delivered in the same steady, confident tone. One answer is true. One is pure invention. And nothing in the way it speaks tells them apart. That second kind of answer has a nickname: a “hallucination”. This lesson is about where it comes from, and the surprise is that it comes from exactly the same place the right answers do.

The machine only ever predicts

You met the core idea already: a text model predicts a likely next word from patterns, over and over, until it has a sentence. It has no stored list of facts to look up. It has no sense of true or false. It has one move — produce a likely continuation — and it makes that move whether the question is easy or impossible.

Hold on to that. It has one move. There is no second mode where it stops, checks a fact, and says “I don’t actually have this.” Stopping to check is not something it can do. Saying “I don’t know” would itself just be another predicted continuation, and usually a less likely one than a smooth, helpful-sounding answer. So it keeps going.
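To make the "one move" concrete, here is a minimal sketch of a toy predictor. It is far simpler than a real text model: it only counts which word tends to follow which (a "bigram" count), and the training text, the function names, and the word limit are all assumptions made up for this illustration. What it shares with the real thing is the loop: pick a likely next word, append it, repeat.

```python
from collections import Counter, defaultdict

# Tiny made-up training text, an assumption for illustration only.
TRAINING_TEXT = (
    "water is made of hydrogen and oxygen . "
    "water is made of hydrogen and oxygen . "
    "steam is made of water ."
)

def count_followers(text):
    """Record how often each word follows each other word."""
    followers = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        followers[current][following] += 1
    return followers

def continue_text(followers, prompt, max_words=10):
    """Apply the single move repeatedly: append the most frequent next word."""
    words = prompt.split()
    for _ in range(max_words):
        options = followers.get(words[-1])
        if not options:  # toy limitation: this word was never seen with a follower
            break
        next_word = options.most_common(1)[0][0]
        words.append(next_word)
        if next_word == ".":  # the sentence has ended
            break
    return " ".join(words)

followers = count_followers(TRAINING_TEXT)
print(continue_text(followers, "water"))
# prints: water is made of hydrogen and oxygen .
```

Notice what is missing from the loop. There is no list of facts to look up and no step that asks whether the sentence is true. The only question it ever asks is which word usually comes next.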

When “likely” and “true” line up

Some questions sit on top of a strong pattern. What is water made of? That fact appears in a huge amount of the text the model learned from, always phrased in roughly the same way. So the most likely continuation is the correct one. Likely and true point in the same direction, and the answer comes out right.

This is most of what a good model does. On well-covered ground, prediction and truth agree, and you get a fluent, correct reply. It feels like the machine knows things. It doesn’t — it’s just that on this question, the likely words happen to be the true ones.

When there is nothing to match

Now ask about a made-up book. There is no such book, so there is no pattern behind its contents — nothing in what the model learned describes this thing, because this thing was never real.

Here is the trap. The model does not notice the gap. It cannot notice it, because noticing would require a sense of “there is no pattern here,” and all it has is “what’s the likely next word.” So it does the only thing it can: it fills the gap with the shape of a right answer. A book summary usually has a plot, so it invents a plot. It usually names a genre, so it picks one. Ask for a source and it produces an author, a title, and page numbers — each one shaped exactly like a real citation, none of them real.

The made-up answer is not the machine breaking. It is the machine working normally over empty ground. Same single move, same fluent output — one question had something true to match, and one had nothing, so it made the nothing sound like something.

A worked example: two questions, one machine

Picture one assistant answering two questions back to back.

Question one: “What is water made of?” Strong pattern, well covered. Out comes: “Two hydrogen atoms bonded to one oxygen atom.” True. Confident. Clean grammar.

Question two: “Summarise the novel The Copper Lantern.” There is no such novel. But the model does not say so. It predicts the likely shape of a summary and produces: “A quiet historical drama about a lamp-maker in a coastal town, first published in the early nineteen-hundreds, exploring memory and loss.” Every phrase is plausible. Every phrase is invented. And it arrives in the exact same tone as the water answer — same confidence, same smoothness.

Two questions, one machine, one process. The only difference was on the outside: one had a truth to match and one didn’t. The machine could not feel that difference, so it treated both the same.
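The two questions above can be replayed on a toy predictor. This is a sketch under loud assumptions: the training text is invented for the example (including the two novel titles in it), the model only looks at the single previous word, and a real model uses far more context than that. The point it illustrates is narrower: the same function handles both prompts, and nothing in it can tell that one title never appeared in training.

```python
from collections import Counter, defaultdict

# Made-up training text (an assumption): one fact about water and two
# summaries of novels whose titles are invented for this sketch.
TRAINING_TEXT = (
    "water is made of hydrogen and oxygen . "
    "the novel harbour lights is a quiet drama about loss . "
    "the novel winter road is a quiet drama about loss ."
)

def count_followers(text):
    """Record how often each word follows each other word."""
    followers = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        followers[current][following] += 1
    return followers

def answer(followers, prompt, max_words=12):
    """Continue the prompt with the most frequent next word until a full stop."""
    words = prompt.split()
    for _ in range(max_words):
        options = followers.get(words[-1])
        if not options:
            break
        next_word = options.most_common(1)[0][0]
        words.append(next_word)
        if next_word == ".":
            break
    return " ".join(words)

followers = count_followers(TRAINING_TEXT)

# Question one: a prompt with a real pattern behind it.
print(answer(followers, "water is made of"))
# prints: water is made of hydrogen and oxygen .

# Question two: "copper lantern" appears nowhere in the training text.
print(answer(followers, "the novel copper lantern is"))
# prints: the novel copper lantern is a quiet drama about loss .
```

The second output is the shape of a summary, borrowed from other books and attached to a title the model never saw. The code path is identical for both prompts.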

Why it looks so convincing

Fluency and clean formatting are precisely what a text model is best at predicting — that is its whole skill. So a fabricated fact comes dressed in the same clothes as a real one: confident tone, tidy grammar, a citation shaped like a citation. There is no wobble in the voice when it’s making things up, because the voice is just predicted words, and predicting smooth words is easy on any topic.

This is the part worth burning in: a confident, fluent sentence is a sign of a strong word-pattern, not a sign of truth. The model’s confidence measures how well the words flow, never whether they match the world.
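A small sketch can show what "confidence measures how well the words flow" means in numbers. The training text, the function names, and the scoring rule here are assumptions for illustration: the score simply multiplies, for each word, how often it followed the previous word in training. Real models score text in a far richer way, but the score is still a likelihood of words, not a check against the world.

```python
from collections import Counter, defaultdict

# Made-up training text, an assumption for illustration only.
TRAINING_TEXT = (
    "water is made of hydrogen and oxygen . "
    "salt is made of sodium and chlorine ."
)

def count_followers(text):
    """Record how often each word follows each other word."""
    followers = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        followers[current][following] += 1
    return followers

def fluency_score(followers, sentence):
    """Multiply the probability of each word given the word before it."""
    words = sentence.split()
    score = 1.0
    for current, following in zip(words, words[1:]):
        options = followers.get(current)
        if not options:  # the previous word was never seen with a follower
            return 0.0
        score *= options[following] / sum(options.values())
    return score

followers = count_followers(TRAINING_TEXT)

true_and_fluent = "water is made of hydrogen and oxygen"
false_but_fluent = "water is made of sodium and chlorine"
true_but_unfamiliar = "oxygen and hydrogen make water"

print(fluency_score(followers, true_and_fluent))      # prints: 0.25
print(fluency_score(followers, false_but_fluent))     # prints: 0.25
print(fluency_score(followers, true_but_unfamiliar))  # prints: 0.0
```

The false sentence scores exactly as high as the true one, because every step in it follows a familiar pattern. The true sentence in an unfamiliar word order scores lowest of all. The number tracks flow, and nothing else.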

What actually helps? Two plain things. Give the model the real source text and ask it to work only from that — then there’s a genuine pattern under its feet instead of a gap. Or check its claims yourself against a source you trust. Neither makes the machine understand truth; they just stop it from having to guess into an empty space.
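Both habits can be sketched in a few lines. Everything named here is hypothetical: the prompt wording is just one plausible phrasing, no real model or library is being called, and the check is a deliberately naive word-for-word match that would miss paraphrases. It is meant to show the idea, not a finished tool.

```python
def build_grounded_prompt(source_text, question):
    """Wrap a question with the real source and an instruction to stay inside it."""
    return (
        "Answer using only the source below. "
        "If the source does not contain the answer, say so.\n\n"
        f"SOURCE:\n{source_text}\n\n"
        f"QUESTION: {question}"
    )

def quotes_missing_from_source(quotes, source_text):
    """Return the quoted passages that do not appear word for word in the source."""
    normalized_source = " ".join(source_text.lower().split())
    return [
        quote
        for quote in quotes
        if " ".join(quote.lower().split()) not in normalized_source
    ]

# A trusted source you supply yourself (made up for this example).
source = "Water is a compound of hydrogen and oxygen."

prompt = build_grounded_prompt(source, "What is water made of?")

# Passages a model's answer claims to be quoting (made up for this example).
claimed_quotes = ["compound of hydrogen and oxygen", "first published in 1901"]
print(quotes_missing_from_source(claimed_quotes, source))
# prints: ['first published in 1901']
```

The first function puts real text in front of the model so the likely continuation has something true to lean on. The second is the human check done mechanically: anything the answer claims to quote that is not in the source gets flagged for you to look at.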

On the whole

Making things up is not a bug someone forgot to fix. It is the ordinary prediction machine doing its ordinary job on ground where the likely answer and the true answer part ways. The model has no notion of true or false — only likely — so it can be fluent, confident, and completely wrong, all at once, with no signal to warn you.

That is a real limit, and it will not fully go away, because it is built into what the thing is. It asks a little humility of us as readers: the smoothness of an answer tells you the machine found a strong pattern to imitate. It never, by itself, tells you the answer is true. Knowing the difference is now our job, not the machine’s.

02 · Try · the lab

03 · Check · quick quiz

1. A model gives a fluent, confident answer to a question about an obscure fact. Ask it for a source and it produces a real-looking author, title, and page numbers — all fake. Why did it do this instead of saying 'I don't have that'?

Saying 'I don't know' isn't something it can decide to do — its only move is to produce a likely continuation, so it fills the gap with the shape of a citation
It looked up a database, found nothing, and guessed to be helpful
It was trained to lie about sources it hasn't read
A bug caused it to swap the real citation for a fake one

Answer

Saying 'I don't know' isn't something it can decide to do — its only move is to produce a likely continuation, so it fills the gap with the shape of a citation — The model has one move: predict likely words. It has no separate step where it checks whether a source exists. Over a gap, the likely continuation is the shape of a citation — so it produces one, real-looking and false. Same machine, empty ground.

2. You ask a model two questions. It answers both in the same smooth, confident tone. One answer turns out true, one made up. What does the confident tone actually tell you?

That both answers are true, since the model wouldn't sound sure otherwise
That the made-up answer is a rare error mode the model slipped into
That the model found strong word-patterns to imitate — which says nothing about whether either answer matches the world
That the model double-checked both answers before replying

Answer

That the model found strong word-patterns to imitate — which says nothing about whether either answer matches the world — Fluency measures how well the words flow, not whether they're true. The model is best at predicting smooth, confident text on any topic — so a fabrication wears the same tone as a fact. The voice never signals truth.

3. Someone says: 'Hallucination is a bug in the model that engineers just haven't patched yet.' Why is that the wrong way to see it?

It is correct — a future update will stop models from ever being wrong
Making things up is the same prediction machine running normally over ground with no real pattern behind it, not a broken mode bolted on
Hallucination only happens when the model overheats or runs low on memory
It's a bug, but it comes from bad training data alone and clean data removes it entirely

Answer

Making things up is the same prediction machine running normally over ground with no real pattern behind it, not a broken mode bolted on — The model always predicts a likely continuation. When there's a strong pattern, likely equals true and it's right; when there's no pattern, it fills the gap anyway. Both come from the identical process — so a confident wrong answer is always possible by design.

4. Which step genuinely reduces a model's chance of making something up on a factual task?

Asking it to promise it will only tell the truth
Telling it to sound more confident so its answers are more reliable
Giving it the real source text to work from, or checking its claims yourself against a trusted source
Rephrasing the question until the confidence bar reads higher

Answer

Giving it the real source text to work from, or checking its claims yourself against a trusted source — Giving the model real source text puts a genuine pattern under its feet instead of a gap; checking its claims yourself catches the fabrications it can't catch. Promises and higher confidence change the tone, not the truth — the model has no sense of true versus false to appeal to.