Lesson 8 of 13
How a clinical trial works
Explain the clinical-trial pipeline — phase 1 (is it safe?), phase 2 (is there a signal?), phase 3 (is it better than what we already have?) — and why the control group is the whole point: without comparing against a placebo or standard care, and without blinding, natural recovery and the placebo effect masquerade as the drug working.
01 · Learn · the idea
A new pill is tested for back pain. The people who take it are asked a month later how they feel, and 72 of every 100 say they are better. Seventy-two percent. That sounds like a strong drug. A headline would write itself. But hold the celebration — because you have just been shown the single most misleading number in medicine. The question that decides whether the pill works isn’t “how many got better.” It’s “how many more got better than would have anyway.” And answering that takes a machine built specifically to stop you fooling yourself: the clinical trial.
Why “people got better” proves nothing
Most things get better on their own. Back pain eases. Colds clear. Even some serious conditions wax and wane. This is natural recovery, and it happens whether or not anyone takes a pill.
On top of that sits the placebo effect: people who believe they are being treated genuinely report feeling better, sometimes measurably so, even when the “treatment” is a sugar pill. Expectation is not nothing. It shifts how people rate pain, mood, fatigue — the very things trials often measure.
So when 72 of 100 people improve after taking a drug, that 72 is a mix: some recovered naturally, some improved on expectation, and maybe some improved because the drug did something. You cannot tell which is which by looking at the treated group alone. The number is real and almost useless.
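To see why the treated group alone cannot settle the question, here is a minimal sketch in Python. The two scenarios and every number inside them are invented for illustration; the lesson supplies only the total of 72. The point is that very different worlds can produce the same headline.

```python
# Two hypothetical worlds. Every number here is invented for illustration;
# the lesson only gives the total of 72 improved out of 100.
scenarios = {
    "drug does nothing": {"natural_recovery": 50, "placebo_effect": 22, "drug_effect": 0},
    "drug does a lot": {"natural_recovery": 30, "placebo_effect": 10, "drug_effect": 32},
}


def improved_in_treated_group(causes):
    """Total improved among 100 treated patients, summed over all causes."""
    return sum(causes.values())


totals = {name: improved_in_treated_group(causes) for name, causes in scenarios.items()}
for name, total in totals.items():
    print(f"{name}: {total} of 100 improved")
```

Both lines report 72 of 100, so the treated group's number cannot tell the two worlds apart.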
The control group is the whole point
The fix is to run the experiment twice at once. Give the real drug to one group. Give an identical-looking dummy (a placebo, or, where withholding care would be wrong, the existing standard treatment) to a second group, the control group. Both groups are made as alike as possible, usually by assigning people to them at random, then watched the same way.
Now the comparison does the work. Whatever happens in the control group is everything except the drug — natural recovery plus placebo effect plus the ordinary noise of measuring people. The real effect of the drug is the difference between the two groups, not the raw number in the treated one.
Run our back-pain trial properly. The drug group: 72 of 100 improve. The placebo group: 65 of 100 improve. The honest effect of the drug is not 72 percent. It is 72 minus 65 — about 7 percentage points. The same result, read two ways: a miracle if you hide the control, a modest helper if you show it. The control group is what turns a number into evidence.
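The subtraction above can be written out as a short Python sketch. The figures are the ones from the back-pain example; the helper names `percent_improved` and `drug_effect_points` are hypothetical, chosen only for this illustration.

```python
def percent_improved(improved, total):
    """Share of a group that improved, in percent."""
    return 100 * improved / total


def drug_effect_points(treated, control):
    """Difference in improvement between two groups, in percentage points.

    Each argument is an (improved, total) pair.
    """
    return percent_improved(*treated) - percent_improved(*control)


drug_group = (72, 100)      # figures from the back-pain example
placebo_group = (65, 100)

headline = percent_improved(*drug_group)
effect = drug_effect_points(drug_group, placebo_group)

print(f"headline number: {headline:.0f}% improved")
print(f"controlled effect: {effect:.0f} percentage points")
```

The function takes both groups on purpose: there is no way to call it with the treated group alone, which mirrors the lesson's claim that the raw number is not evidence by itself.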
Blinding stops expectation from leaking in
There is one more way to fool yourself, and it is subtle. If patients know they got the real drug, their expectation runs hotter — the placebo effect lands harder on them than on the control group, and that gap masquerades as a drug effect. If the doctors know who got what, they unconsciously rate the treated patients more generously, ask leading questions, watch them more closely.
So good trials are blinded: patients don’t know which group they’re in, and in a double-blind trial neither do the staff assessing them, until the data is locked. Blinding doesn’t remove the placebo effect; it makes it land equally on both groups, so it cancels out in the difference. Unblind the back-pain trial and the placebo group might drop to 60 while expectation pushes the drug group to 76. The gap inflates from 7 to 16, and the extra 9 points are an artefact of who knew what.
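The blinded and unblinded figures can be set side by side in a few lines of Python. This is a sketch using the lesson's own numbers, including its illustrative "might" figures for the unblinded case; the variable names are assumptions made for the example.

```python
def gap(group_results):
    """Apparent drug effect: drug-group improvers minus placebo-group improvers."""
    return group_results["drug"] - group_results["placebo"]


# Improvers per 100 patients in each arm.
blinded = {"drug": 72, "placebo": 65}
unblinded = {"drug": 76, "placebo": 60}  # the lesson's illustrative "might" figures

blinded_gap = gap(blinded)
unblinded_gap = gap(unblinded)
inflation = unblinded_gap - blinded_gap  # the part owed to who knew what

print(f"blinded gap:   {blinded_gap} points")
print(f"unblinded gap: {unblinded_gap} points")
print(f"inflation from unblinding: {inflation} points")
```

The drug is the same in both cases; only the knowledge of who received it has changed, and that alone accounts for the inflation.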
Three questions, in order: the phases
A drug doesn’t face one trial but a sequence, each asking a harder question. This is the phase structure.
Phase 1 — is it safe? A small group, often healthy volunteers, takes the drug at rising doses. The question is not “does it work” but “what does it do to a body, and at what dose does harm start.” Most of what’s learned here is about toxicity and tolerable amounts.
Phase 2 — is there a signal? Now patients who actually have the condition take it, in a larger group, usually against a control. The question is “does this look like it’s doing anything at all, and at what dose.” A weak or absent signal ends most drugs here.
Phase 3 — is it better than what we already have? The big one: hundreds or thousands of patients, controlled and blinded, often compared against the current best treatment, not just a sugar pill. The question is not “does it do something” but “is it worth switching to.” A drug can clear phase 1 and 2 and still fail here by being no better than the cheap old option — which is a perfectly good reason to reject it.
Each phase is a filter. A drug only advances by answering its question yes. The pipeline is built so that the most expensive, largest test comes last, after the cheaper questions have already weeded out most candidates.
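The filter idea can be sketched in Python as an ordered list of yes-or-no questions. This is a deliberate simplification: the `Candidate` fields and the four example drugs are hypothetical, and real phase decisions rest on trial data and regulatory judgment rather than a single flag.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A made-up drug candidate; each flag stands in for a whole trial's verdict."""
    name: str
    safe_at_some_dose: bool          # phase 1 answer
    shows_signal: bool               # phase 2 answer
    better_than_current_care: bool   # phase 3 answer


# Phases in order: the cheapest question first, the largest trial last.
PHASES = [
    ("phase 1 (is it safe?)", lambda c: c.safe_at_some_dose),
    ("phase 2 (is there a signal?)", lambda c: c.shows_signal),
    ("phase 3 (is it better than what we already have?)",
     lambda c: c.better_than_current_care),
]


def stopped_at(candidate):
    """Return the first phase whose question gets a 'no', or None if all pass."""
    for label, answers_yes in PHASES:
        if not answers_yes(candidate):
            return label  # later, costlier phases are never run
    return None


candidates = [
    Candidate("drug A", False, False, False),
    Candidate("drug B", True, False, False),
    Candidate("drug C", True, True, False),
    Candidate("drug D", True, True, True),
]

for c in candidates:
    outcome = stopped_at(c)
    verdict = "clears all three phases" if outcome is None else f"stopped at {outcome}"
    print(f"{c.name}: {verdict}")
```

Because the loop returns at the first "no", a candidate that fails early never reaches the expensive phase 3 check, which is the ordering the pipeline is designed around. The hypothetical drug C is the case from the text: safe, showing a signal, and still rejected for not beating the existing option.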
A note on what this is
This explains how evidence is built — how researchers decide whether a treatment works. It is not medical advice, and nothing here tells you what to take or avoid. Reading a trial well is a thinking skill, not a prescription.
On the whole
The clinical trial is one of the few places humans have engineered a defence against their own wishful thinking. Left alone, we see a pattern, feel a hope, and call it proof. The control group and the blind are a confession built into the method: we know we will fool ourselves, so we have arranged things so we can’t. That posture — assume you’re biased, then build the comparison that exposes it — reaches far past medicine. Any time you’re tempted to say “I tried it and I got better,” the quiet question waits: better than what? You are inside the same trap the trial was built to escape, and knowing the shape of it is the first step out.
02 · Try · the lab
03 · Check · quick quiz
1. A pill is tested on 100 people with back pain; 72 report they're better a month later. What does that 72% by itself prove about the drug?
- The drug works — 72% is a strong response
- Almost nothing, because some of those people would have improved anyway
- The drug fails, since 28% did not improve
- The drug is unsafe at that dose
Answer
Almost nothing, because some of those people would have improved anyway — Back pain often eases on its own, and expectation alone makes people report improvement. Without a comparison group, you can't tell how much of the 72% is the drug versus natural recovery and the placebo effect.
2. In a blinded trial, the drug group improves 72% and the placebo group improves 65%. What is the drug's real effect?
- 72 percentage points
- 65 percentage points
- About 7 percentage points — the difference between the groups
- About 137 percentage points — the two groups added together
Answer
About 7 percentage points — the difference between the groups — The placebo group captures everything except the drug: natural recovery plus the placebo effect. The drug's true effect is the difference, 72 − 65 = 7 — modest, but real.
3. Why are good trials blinded, so patients (and ideally the assessors) don't know who got the real drug?
- To stop the placebo effect from happening at all
- So expectation lands equally on both groups and cancels out in the difference
- To make the trial cheaper to run
- Because it's required for the drug to be patented
Answer
So expectation lands equally on both groups and cancels out in the difference — Blinding doesn't erase the placebo effect — it makes it hit both groups the same, so it subtracts out. If only the drug group knows it got the real pill, its hotter expectation inflates the apparent gap.
4. A drug clears phase 1 (safe) and phase 2 (a signal), then is rejected in phase 3. What did phase 3 most likely find?
- The drug was toxic at every dose
- The drug did nothing at all
- The drug worked, but was no better than the treatment we already have
- The trial had no control group
Answer
The drug worked, but was no better than the treatment we already have — Phase 3 asks 'is it better than current care?', often comparing against the existing best treatment. A drug can be safe and do something yet still fail by not beating the cheap old option — a perfectly good reason to reject it.