Daylila
How AI actually works

Lesson 3 of 13

Learning means tuning the numbers

Explain that training adjusts a model's numbers to shrink its error on examples.

01 · Learn · the idea

In the last lab you set the model’s two knobs by hand and eyeballed the fit — nudging the line until it looked about right. A machine has no eyes. It can’t glance at a chart and think “close enough.” So how could it ever tell a good setting of the knobs from a bad one? It needs the one thing that turns “looks right” into something a machine can chase: a number for how wrong it is.

The miss on a single example

Start with one flat. It really sold for £210,000. Your model, with its current knobs, guesses £200,000. The gap between them — £10,000 — is the miss on that flat. Guess too high, too low, doesn’t matter; what counts is the distance between the guess and the truth.

That distance is something a machine can compute without understanding houses at all. Subtract, take the size of the gap, done. One flat, one number: how far off we were.
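As a minimal sketch, here is the miss on the flat above in Python. The function name `miss` is an assumption made for illustration; the prices are the ones from the text.

```python
def miss(guess, truth):
    """Size of the gap between a guess and the real price, ignoring direction."""
    return abs(guess - truth)


# The flat from the text: sold for £210,000, model guessed £200,000.
print(miss(200_000, 210_000))  # 10000

# Guessing £10,000 too high gives exactly the same miss.
print(miss(220_000, 210_000))  # 10000
```

Taking the absolute value is what makes "too high" and "too low" count the same.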

Add up every miss and you get one score

A model doesn’t face one flat, it faces all of them. So do the obvious thing: work out the miss on every example, then add them up. That total is the model’s error — sometimes called its loss. One number that says how wrong the whole model is, across all the data at once.

Now “good knobs” has an exact meaning at last. Good knobs are the knobs that make the total error small. The best possible knobs are the ones that make it as small as it can go. We’ve quietly turned a vague wish — fit the data well — into a precise target: make this number small. That shift is everything. A machine can’t chase “fit well.” It can absolutely chase “make this number small.”
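The add-it-all-up step can be sketched in a few lines of Python. The function name `total_error` and the two flats used here (prices in thousands of pounds) are illustrative assumptions, chosen only to show the mechanics.

```python
def total_error(guesses, truths):
    """Add up the miss (absolute gap) on every example."""
    return sum(abs(guess - truth) for guess, truth in zip(guesses, truths))


# Two illustrative flats, prices in £k.
truths = [210, 250]   # what the flats really sold for
guesses = [200, 260]  # what the model guessed

# One miss of 10 below and one miss of 10 above add up to 20.
print(total_error(guesses, truths))  # 20
```

Note that the two misses do not cancel out, even though one guess is too low and the other too high. Each counts as distance, so the total only reaches zero when every guess is exactly right.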

A worked example: three lines, three scores

Take the same five flats and keep the base B fixed at £20,000, so only the slope A changes. The model’s guess for a flat is A × area + B. We’ll add up the misses, in thousands of pounds, for three settings.

The flats: 50 m² → £170k, 60 → £210k, 70 → £230k, 80 → £250k, 90 → £300k.

Try A = £2,000/m². Guesses: £120k, £140k, £160k, £180k, £200k. Misses: 50, 70, 70, 70, 100. Total error = £360k. The line is far too shallow — it sits well below every real price.

Try A = £3,000/m². Guesses: £170k, £200k, £230k, £260k, £290k. Misses: 0, 10, 0, 10, 10. Total error = £30k. Almost every guess lands on or beside the real price. This line runs through the middle of the data.

Try A = £4,000/m². Guesses: £220k, £260k, £300k, £340k, £380k. Misses: 50, 50, 70, 90, 80. Total error = £340k. Too steep now — the line shoots up past the real prices.

Line up the three scores: 360, then 30, then 340. The error falls as A climbs from 2,000, bottoms out around 3,000, then rises again as A overshoots. The best setting sits at the bottom of that valley. And notice: you found it without once deciding a line “looked good.” You compared three numbers and picked the smallest.
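The three scores above can be reproduced with a short Python sketch. It assumes the model's guess is A × area + B, which matches the guesses listed in the worked example; the names `AREAS`, `PRICES`, `BASE` and `total_error` are made up for illustration. Everything is in thousands of pounds.

```python
AREAS = [50, 60, 70, 80, 90]        # floor area in m²
PRICES = [170, 210, 230, 250, 300]  # real sale prices in £k
BASE = 20                           # B, fixed at £20k


def total_error(slope_k):
    """Total miss in £k for a slope given in £k per m²."""
    return sum(abs(slope_k * area + BASE - price)
               for area, price in zip(AREAS, PRICES))


scores = {slope_k: total_error(slope_k) for slope_k in (2, 3, 4)}
for slope_k, score in scores.items():
    print(f"A = £{slope_k},000/m²: total error £{score}k")
# A = £2,000/m²: total error £360k
# A = £3,000/m²: total error £30k
# A = £4,000/m²: total error £340k

# Picking the best line is just picking the smallest number.
best = min(scores, key=scores.get)
print(best)  # 3
```

The last two lines are the point: no chart, no judgement, only a comparison of three numbers.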

Learning is a search for the bottom of the valley

Here is the whole idea of learning, in one sentence: learning is the search for the knobs that make the error smallest. Nothing more mysterious than that. The machine doesn’t grasp what a house is, or why space costs money. It has a number that measures its own wrongness, and it hunts for the knob settings that push that number down.

You can even feel the shape of the hunt. From a bad setting, try nudging a knob a little. If the error drops, you nudged the right way — keep going. If the error rises, you went the wrong way — turn back. Repeat, and you walk downhill toward the smallest error you can reach. That’s it. That’s the engine.
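The nudge-and-check hunt can be sketched as a small trial-and-error loop in Python, using the five flats from the worked example with B fixed at £20,000. This is only an illustration of the idea described here, not the exact procedure a real training loop uses. The function names, the £100 step size and the cap on the number of steps are all assumptions.

```python
AREAS = [50, 60, 70, 80, 90]                             # m²
PRICES = [170_000, 210_000, 230_000, 250_000, 300_000]   # £
BASE = 20_000                                            # B, held fixed


def total_error(slope):
    """Total miss in £ for a slope given in £ per m²."""
    return sum(abs(slope * area + BASE - price)
               for area, price in zip(AREAS, PRICES))


def nudge_downhill(slope, step=100, max_steps=1000):
    """Nudge the slope up or down by `step` for as long as the error drops."""
    error = total_error(slope)
    for _ in range(max_steps):
        for candidate in (slope + step, slope - step):
            candidate_error = total_error(candidate)
            if candidate_error < error:
                # The nudge helped: keep it and try again from here.
                slope, error = candidate, candidate_error
                break
        else:
            # Neither direction lowered the error: stop here.
            return slope, error
    return slope, error


print(nudge_downhill(2000))  # (3000, 30000)
print(nudge_downhill(4000))  # (3000, 30000)
```

Starting too shallow or too steep, the loop ends at the same place: a slope of £3,000/m² with a total error of £30,000. With a different step size it could stop slightly to one side of that point, since it can only land on settings the step allows.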

Why this is the hinge of the whole field

Everything grand about modern AI — the billions of knobs, the mountains of data, the weeks of computing — is in service of this single, humble loop: measure the error, nudge the knobs to make it smaller, repeat. The reason a machine can “learn” at all is that we found a way to score its wrongness as a plain number, and a plain number is something a machine can relentlessly push down.

It’s worth noticing what quietly went missing. Nowhere in here does the model understand anything. It has no idea what a flat is. It is rolling numbers downhill to shrink a score we defined. That gap (real skill on the outside, no understanding on the inside) sits under every impressive and every unsettling thing AI does, and we’ll keep meeting it. But first comes the exact downhill walk: the training loop, up next, is where the machine finally learns on its own.

02 · Try · the lab

03 · Check · quick quiz

1. Why does a model need an error (or 'loss') number at all?

  • To show the user how confident it is
  • To turn the vague goal 'fit well' into a precise target a machine can chase: make this number small
  • To store the correct answers so it can look them up
  • To decide how many parameters it should have
Answer

To turn the vague goal 'fit well' into a precise target a machine can chase: make this number small — A machine can't chase 'looks right' — it has no eyes. Error is a single number measuring total wrongness, so 'find good knobs' becomes 'make this number as small as possible,' which a machine can actually do.

2. Keeping B fixed, a model scores total error £360k at A=£2,000/m², £30k at A=£3,000/m², and £340k at A=£4,000/m². What does that tell you?

  • The model is broken because the errors are so different
  • A should be as large as possible
  • The best setting sits near A=£3,000, at the bottom of the valley where error is smallest
  • Error has nothing to do with which A is best
Answer

The best setting sits near A=£3,000, at the bottom of the valley where error is smallest — Error falls as A rises from 2,000, bottoms out near 3,000, then climbs again as the line overshoots. Learning is the search for that bottom — the knobs with the smallest error — found by comparing numbers, not by eyeballing.

3. After dragging the line to its best fit, some misses still aren't zero. What does that mean?

  • You didn't drag carefully enough; a perfect line exists
  • The real data is bumpier than a straight line, so learning finds the least-wrong setting, not a flawless one
  • The model has too few parameters to make any prediction
  • The error number must be calculated wrong
Answer

The real data is bumpier than a straight line, so learning finds the least-wrong setting, not a flawless one — Real flats don't sit on a perfect line, so no straight model can hit every price. Training doesn't chase perfection — it finds the knobs with the smallest total error it can reach. Least-wrong, not never-wrong.