Daylila
How AI actually works

Lesson 10 of 13

Bias: the mirror of the data

Explain how bias enters through the data, and why it is hard to remove.

01 · Learn · the idea

A company keeps every record of who it has hired over the years. It wants to speed up screening, so it trains a model on that history: here are the people we said yes to, here are the ones we said no to — learn the pattern. The model does exactly what you’d expect from earlier in this course. It hunts through the examples and finds the shape of a “yes”. But the history it learned from was skewed. So the model learned the skew, and now it applies it to every new person, fast and without a pause. Nobody told it to be unfair. It just held up a mirror.

The model has no values — it has a mirror

By now you know a model learns the patterns in its training data. It has no idea what any pattern means. It doesn’t know what a person is, what fair is, what a job is. It sees columns of numbers with “yes” or “no” attached, and it fits itself to reproduce those answers.

So if the past answers carry an unfair lean — because the people making those old decisions leaned that way, for reasons that had nothing to do with who was actually good at the work — the model can’t tell the difference. To the model, the unfair lean is the pattern. It is the strongest, most reliable signal in the data for predicting “yes”. So it learns it, hard.

This is worth being sober about. The model isn’t malicious and it isn’t broken. It’s doing its one job perfectly: reflect the data. The unfairness didn’t come from the machine. It came from the world the data recorded. The machine just made that unfairness automatic, scaled it up, and hid it inside a system that looks neutral because it’s made of maths.

It doesn’t just copy the skew — it can sharpen it

Here’s the part that catches people off guard. You might expect a model trained on a skewed pile to copy the skew, no better and no worse. It can do something worse: it can amplify the skew.

Think about why. The model is rewarded for one thing: matching the training answers as closely as possible. If leaning toward one group scored well on the old data, then leaning even harder toward that group scores even better. The lean is the winning move, so the model plays it to the hilt. A history that was 80/20 can come out of the model looking like 88/12. The mirror doesn’t just reflect the tilt — it exaggerates it, because exaggerating it is what fits the skewed examples best.
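To see why leaning harder wins, here is a minimal sketch in Python. It assumes a deliberately simplified setup that is not taken from the lesson: the old answers favoured Group A 80% of the time, and the model favours Group A with some fixed probability, independently of each case. The name `expected_match` and the list of leans are made up for illustration.

```python
def expected_match(history_lean, model_lean):
    """Chance the model's answer agrees with the historical answer,
    assuming each favours Group A independently with the given probability."""
    both_favour_a = history_lean * model_lean
    both_favour_b = (1 - history_lean) * (1 - model_lean)
    return both_favour_a + both_favour_b

HISTORY_LEAN = 0.80  # assumption: old decisions favoured Group A 80% of the time

# Try a model that ignores group, one that copies the lean, and two that exaggerate it.
for model_lean in (0.50, 0.80, 0.88, 1.00):
    score = expected_match(HISTORY_LEAN, model_lean)
    print(f"model leans {model_lean:.0%} toward A -> matches history {score:.1%} of the time")
```

Under these assumptions, copying the 80/20 lean exactly matches the history 68% of the time, leaning 88/12 matches it 72.8% of the time, and leaning all the way matches it 80% of the time. A real model has more signals to work with than this, so it rarely goes all the way, but the pull is in that direction.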

The stubborn part: you can’t just delete the field

So you spot the problem and reach for the obvious fix. The unfairness is about which group someone is in — so delete that field. Strip “group” out of the data entirely, retrain, and now the model literally cannot see it. Problem solved?

Almost never. The model finds proxies — other fields that quietly stand in for the one you removed. A postcode. The school someone attended. A hobby. A single word buried in an application. If any of those correlate with the group you deleted, the model leans on them instead and rebuilds the exact same skew through the back door. You took away the label. You didn’t take away the pattern the label was riding on. The tilt was woven through the whole dataset; pulling one thread doesn’t unweave it.
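A rough way to see how much a proxy gives away, sketched in Python. The records and counts are invented for illustration, and `north` and `south` stand in for any field that happens to correlate with group. The sketch measures how well the deleted field can be read back from postcode alone.

```python
from collections import Counter, defaultdict

# Hypothetical records as (group, postcode) pairs. The counts are made up:
# most of Group A lives in "north", most of Group B in "south".
records = (
    [("A", "north")] * 450 + [("A", "south")] * 50
    + [("B", "north")] * 50 + [("B", "south")] * 450
)

def majority_group_by_postcode(rows):
    """For each postcode, find the group that appears there most often."""
    counts = defaultdict(Counter)
    for group, postcode in rows:
        counts[postcode][group] += 1
    return {postcode: c.most_common(1)[0][0] for postcode, c in counts.items()}

lookup = majority_group_by_postcode(records)

# Pretend the group field was deleted: guess it back from postcode only.
recovered = sum(lookup[postcode] == group for group, postcode in records) / len(records)

print(lookup)     # {'north': 'A', 'south': 'B'}
print(recovered)  # 0.9
```

With these made-up counts, postcode alone gives the group back for 9 records in 10. Anything that predicts the deleted field this well can carry the same lean into the model's answers.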

A worked example

Take that hiring screen. In the company’s past hires, 80% were from Group A and 20% from Group B — a historical skew, not a fact about who does the job well.

Train on it, field visible. Now run 100 fresh candidates through, all equally qualified, 50 from each group. About 88% of the model’s recommendations go to Group A. That is more lopsided than the 80/20 it was shown. That’s the amplification: it turned a tilt into a landslide.

Delete the group field and retrain. Run the same 100. About 82% of the model’s recommendations still go to Group A, barely any change. Postcode and former school correlate with group, so the model reads the group off those proxies and leans the same way. Hiding the label did next to nothing.

Now fix the data instead. Rebalance the training history to 50/50, and check the model’s decisions across both groups rather than only its overall accuracy. Run the same 100 candidates: the recommendations even out to roughly 50/50. The model didn’t grow a conscience. Its mirror was simply given a fairer thing to reflect.
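One common way to rebalance is to reweight the examples rather than throw any away. The lesson does not say which method it has in mind, so treat this as one possible sketch in Python, with a made-up `past_hires` list and a hypothetical helper `balancing_weights`.

```python
from collections import Counter

# Hypothetical training history: one entry per past "yes", labelled by group.
past_hires = ["A"] * 80 + ["B"] * 20

def balancing_weights(groups):
    """Weight per group so that every group carries an equal share in total."""
    counts = Counter(groups)
    target = len(groups) / len(counts)  # the equal share each group should carry
    return {group: target / n for group, n in counts.items()}

weights = balancing_weights(past_hires)

# Check: after weighting, both groups count for the same total.
weighted_totals = {group: weights[group] * n for group, n in Counter(past_hires).items()}

print(weights)          # {'A': 0.625, 'B': 2.5}
print(weighted_totals)  # {'A': 50.0, 'B': 50.0}
```

Each Group B example now counts four times as much as a Group A example, so the 80/20 history weighs in as 50/50. Whether that is enough to even out the recommendations depends on the data, which is why the group-by-group check still matters.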

That’s the whole shape of it. You don’t fix bias by pretending the model is neutral, or by hiding the sensitive field and hoping. You fix it upstream, in the data, and you check the outcomes group by group — because a model trained on a skewed world will faithfully mirror a skewed world.
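Checking outcomes group by group can be as simple as counting. Here is a minimal Python sketch. The `decisions` list is invented for illustration and `selection_rates` is a hypothetical helper, not part of any library.

```python
from collections import Counter

def selection_rates(decisions):
    """Share of candidates recommended, per group.

    decisions: iterable of (group, recommended) pairs, recommended being True or False.
    """
    seen, recommended = Counter(), Counter()
    for group, is_recommended in decisions:
        seen[group] += 1
        recommended[group] += int(is_recommended)
    return {group: recommended[group] / seen[group] for group in seen}

# Invented outcomes for 100 equally qualified candidates, 50 per group.
decisions = (
    [("A", True)] * 44 + [("A", False)] * 6
    + [("B", True)] * 6 + [("B", False)] * 44
)

overall = sum(is_recommended for _, is_recommended in decisions) / len(decisions)
rates = selection_rates(decisions)

print(overall)  # 0.5
print(rates)    # {'A': 0.88, 'B': 0.12}
```

The single overall figure looks unremarkable: half the candidates get a yes. Only the per-group split shows that 88% of Group A and 12% of Group B were recommended.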

On the whole

A model is a mirror held up to its data. Point it at a fair record and it reflects fairness. Point it at a tilted record and it reflects the tilt — often sharpened, and stubborn enough that deleting the obvious field just sends it hunting for proxies. There’s no villain in the machine and no conscience either; there’s only the pattern it was shown.

That’s the humbling thing to carry. These systems feel objective precisely because they’re made of numbers and run without a flicker of hesitation. But “objective” and “fair” are not the same word. A model can be perfectly consistent and consistently wrong about who deserves a yes. Seeing that clearly — knowing the fairness of the output is decided long before the model runs, in the data we chose to feed it — is most of what it takes to use one of these systems without being fooled by how calm and certain it sounds.

02 · Try · the lab

03 · Check · quick quiz

1. A model trained on a company's past hires recommends Group A far more often than Group B, even for equally qualified candidates. What is the most accurate description of what happened?

  • The model decided Group A candidates are better and acted on that judgment
  • The past hiring data leaned toward Group A, and the model learned that lean as if it were the pattern
  • The model has a hidden bias built in by its designers
  • The model made a random error that more training would fix

Answer: The past hiring data leaned toward Group A, and the model learned that lean as if it were the pattern. The model has no values and made no judgment. It reflected the skew already sitting in the data it was shown. The unfairness was in the record; the model just made it automatic and fast.

2. The past data was 80% Group A, yet on 100 equally qualified candidates the model recommends Group A about 88 times. Why does the number come out MORE lopsided than the data?

  • The model copies the skew exactly, so 88 must be a mistake
  • The model added extra Group A candidates to the test set
  • Leaning even harder toward the majority group scores best on the skewed data, so the model amplifies the tilt rather than just copying it
  • 88 is within rounding error of 80 and means nothing

Answer: Leaning even harder toward the majority group scores best on the skewed data, so the model amplifies the tilt rather than just copying it. The model is rewarded only for matching the training answers. On tilted data, leaning harder fits better, so it sharpens the skew instead of merely reproducing it. A mirror that exaggerates.

3. To fix the bias, someone deletes the “group” field so the model literally can’t see it, then retrains. The model still recommends Group A about 82 of 100. What went wrong?

  • The deletion failed and the field was still in the data
  • Deleting a field always makes a model more biased, not less
  • The model memorized the earlier results before the field was removed
  • The model found proxies (other fields, like postcode or former school, that correlate with group) and rebuilt the same skew

Answer: The model found proxies (other fields, like postcode or former school, that correlate with group) and rebuilt the same skew. The tilt was woven through the whole dataset. Removing the label leaves the correlated fields behind, and the model reads the group off those proxies. You took away the name, not the pattern.

4. Which approach actually brings the recommendations close to 50/50 for equally qualified candidates?

  • Rebalancing the training data and checking outcomes across both groups
  • Hiding the sensitive field so the model can't discriminate
  • Telling the model to be fair in its instructions
  • Training on even more of the same skewed history

Answer: Rebalancing the training data and checking outcomes across both groups. Bias enters through the data, so it has to be fixed there: balance the examples and measure outcomes group by group, not just overall accuracy. The model gains no conscience; its mirror is simply given a fairer thing to reflect.