Clinical Epidemiology · Screening Studies · Methods Critique

Overdiagnosis: When Finding More Disease Does Not Mean Saving More Lives

May 19, 2026 · 16 min read

Anas H. Alzahrani, MD PhD MPH

Department of Preventive Medicine and Public Health

Faculty of Medicine, King Abdulaziz University

Screening papers adore a clean narrative: we found more cases, we found them earlier, and survival after diagnosis improved. Sometimes that is real benefit. Sometimes it is a more efficient way to count people who were never going to be harmed by the disease in the first place.

Overdiagnosis means detecting a real pathological abnormality that would not have become clinically apparent or caused death during the patient's lifetime. The lesion exists. The label is accurate. The clinical payoff is where the story collapses.

The Core Mistake

Researchers often treat more detection as proof of benefit. That shortcut fails because diagnosis is not a patient-centered outcome. A screening program can increase incidence, expand treatment, and improve survival among diagnosed patients even when disease-specific mortality barely moves.

Decision rule:

If screening finds more disease and improves post-diagnosis survival, but mortality and serious morbidity do not fall, assume overdiagnosis is on the table until proven otherwise.

That is not anti-screening. It is pro-endpoint. Medicine does not win because a registry gets busier.

Lead-Time Bias and Overdiagnosis Are Neighbors, Not Twins

Problem | What changes? | What may stay unchanged?
Lead-time bias | Diagnosis happens earlier | Date of death
Overdiagnosis | Who gets labeled diseased | Mortality, symptoms, or quality-adjusted life

A screening program can suffer from both at once. Earlier diagnosis stretches survival time on paper. Overdiagnosis adds biologically quiet cases that almost all survive. Together they can make the diagnosed cohort look heroic while the population outcome shrugs.
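A minimal sketch of the lead-time half of that pairing, using made-up ages for a single hypothetical patient (none of these numbers come from a study). The age at death is fixed and only the age at diagnosis moves, yet the patient flips from a five-year non-survivor to a five-year survivor.

```python
# Illustrative only: one hypothetical patient whose age at death is the
# same with or without screening. All ages below are assumptions.
AGE_AT_DEATH = 70
AGE_AT_CLINICAL_DIAGNOSIS = 67  # assumed: diagnosed once symptoms appear
AGE_AT_SCREEN_DIAGNOSIS = 63    # assumed: screening finds it 4 years earlier


def survival_after_diagnosis(age_at_diagnosis: int, age_at_death: int) -> int:
    """Years lived between diagnosis and death."""
    return age_at_death - age_at_diagnosis


def is_five_year_survivor(age_at_diagnosis: int, age_at_death: int) -> bool:
    """True if the patient lived at least five years after diagnosis."""
    return survival_after_diagnosis(age_at_diagnosis, age_at_death) >= 5


clinical_years = survival_after_diagnosis(AGE_AT_CLINICAL_DIAGNOSIS, AGE_AT_DEATH)
screened_years = survival_after_diagnosis(AGE_AT_SCREEN_DIAGNOSIS, AGE_AT_DEATH)
lead_time = AGE_AT_CLINICAL_DIAGNOSIS - AGE_AT_SCREEN_DIAGNOSIS

print(f"Survival after clinical diagnosis: {clinical_years} years")
print(f"Survival after screen diagnosis:   {screened_years} years")
print(f"Lead time: {lead_time} years, age at death unchanged: {AGE_AT_DEATH}")
```

The extra years of "survival" here are exactly the lead time. Nothing about the patient's life changed except how long they carried the diagnosis.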

A Simple Clinical Example

Think about a cancer screening program that starts detecting many slow-growing thyroid or prostate lesions. Incidence climbs sharply. Biopsies, surgery, follow-up imaging, and patient anxiety all climb with it. Yet advanced disease and mortality barely change.

What looks impressive

More cases detected, earlier-stage disease, and better survival among the diagnosed cohort.

What matters clinically

Fewer deaths, fewer metastatic presentations, less major morbidity, or less burdensome treatment.

What often gets ignored

False positives, unnecessary procedures, lifelong surveillance, and treatment of lesions that would have stayed quiet.

Overdiagnosis explorer (toy model)

Better survival and more detected cases can coexist with zero mortality benefit

This toy model holds disease-specific deaths constant and lets screening add indolent cases that would never have become clinically important. The diagnosed cohort looks healthier. The population does not.

Metric | Without screening | With overdiagnosing screen
Detected cases per 1,000 | 40.0 | 64.0
Five-year survival among diagnosed | 35.0% | 59.4%
Disease deaths per 1,000 | 26.0 | 26.0
Extra labels without extra lives saved | 0 | +24

Without screening, only clinically important cases are diagnosed, and five-year survival among them is 35.0%. With screening, survival among the diagnosed rises to 59.4% because harmless cases join the diagnosed cohort and almost all survive. Disease mortality stays at 26.0 per 1,000 by design, which is the point of the warning.

How to read the toy model

Overdiagnosis is not simply earlier diagnosis. It is diagnosis of lesions that would not have caused symptoms or death during the patient's lifetime. Those cases inflate incidence and make post-diagnosis survival look rosier because they are almost guaranteed to survive.

The honest population-level check is mortality, serious morbidity, and treatment burden, not whether the diagnosed cohort suddenly seems healthier after the screening program starts fishing in quieter water.
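The arithmetic behind the toy model can be reproduced in a few lines. This is a sketch under stated assumptions, not the code behind the original explorer: the function name and parameters are hypothetical, and indolent cases are assumed to have 100% five-year survival, which is the value that reproduces the 59.4% figure in the table.

```python
def toy_overdiagnosis_model(
    clinical_cases: float = 40.0,     # clinically important cases per 1,000
    clinical_survival: float = 0.35,  # their five-year survival
    indolent_cases: float = 24.0,     # extra screen-detected cases per 1,000
    indolent_survival: float = 1.0,   # assumption: indolent cases all survive
) -> dict:
    """Return headline metrics per 1,000 people for the screened population."""
    # Deaths come only from clinically important disease, which screening
    # does not change in this model.
    disease_deaths = clinical_cases * (1.0 - clinical_survival)

    detected = clinical_cases + indolent_cases
    survivors = (
        clinical_cases * clinical_survival
        + indolent_cases * indolent_survival
    )
    return {
        "detected_per_1000": detected,
        "five_year_survival": survivors / detected,
        "disease_deaths_per_1000": disease_deaths,
        "extra_labels": indolent_cases,
    }


without_screening = toy_overdiagnosis_model(indolent_cases=0.0)
with_screening = toy_overdiagnosis_model()

for label, result in [("Without screening", without_screening),
                      ("With screening", with_screening)]:
    print(
        f"{label}: {result['detected_per_1000']:.1f} detected, "
        f"{result['five_year_survival']:.1%} five-year survival, "
        f"{result['disease_deaths_per_1000']:.1f} deaths per 1,000"
    )
```

Raising `indolent_cases` pushes five-year survival toward 100% while `disease_deaths_per_1000` never moves, which is the illusion in miniature.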

Failure Modes That Should Make Reviewers Stiff-Arm the Abstract

Red flag | Why it is weak | What to ask for instead
Five-year survival is the headline | Survival from diagnosis is vulnerable to both lead-time bias and overdiagnosis. | Disease-specific mortality, all-cause mortality, and treatment burden.
Incidence rises faster than late-stage disease falls | That pattern suggests extra case-finding without proportional clinical payoff. | Stage-specific trends, metastatic disease trends, and downstream intervention rates.
Harms are discussed as logistics, not outcomes | Biopsy complications, overtreatment, and anxiety are not clerical side notes. | Net-benefit framing that counts harms explicitly.
AI detection model is praised for “finding more positives” | More sensitivity can be clinically worse if it mainly harvests indolent disease. | Evidence that extra detections improve patient-important outcomes.

What Better Evidence Looks Like

1. Mortality first

Show disease-specific mortality and, when plausible, all-cause mortality. If the benefit is real, it should eventually escape the diagnostic file and appear in the patient.

2. Advanced disease trends

A useful screening program should reduce clinically consequential disease, not just increase the count of low-burden findings.

3. Harm accounting

Report biopsies, surgeries, treatment complications, follow-up cascades, and patient burden with the same seriousness used for the putative benefits.

Decision Rules for Busy Reviewers

  • If survival after diagnosis improves but mortality does not, do not call that proof of benefit.
  • If detected-case incidence rises sharply without a comparable drop in advanced disease, suspect overdiagnosis.
  • If the intervention is an AI detector, ask whether it found more clinically useful disease or simply more disease-shaped pixels.
  • If harms are absent from the summary table, the analysis is probably flattering the screen.
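The four rules above can be written down as a small checklist function. This is only a sketch: the `StudySummary` fields are hypothetical yes/no judgments a reviewer would fill in by hand, and terms like "sharply" and "comparable" are left to that judgment rather than turned into invented thresholds.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StudySummary:
    """A reviewer's yes/no reading of a screening paper (hypothetical fields)."""
    survival_after_diagnosis_improved: bool
    mortality_fell: bool
    incidence_rose_sharply: bool
    advanced_disease_fell_comparably: bool
    is_ai_detector: bool
    extra_detections_improved_outcomes: bool
    harms_in_summary_table: bool


def reviewer_flags(study: StudySummary) -> List[str]:
    """Return one warning per decision rule that the study trips."""
    flags = []
    if study.survival_after_diagnosis_improved and not study.mortality_fell:
        flags.append("Survival improved but mortality did not: not proof of benefit.")
    if study.incidence_rose_sharply and not study.advanced_disease_fell_comparably:
        flags.append("Incidence rose without a comparable drop in advanced disease: suspect overdiagnosis.")
    if study.is_ai_detector and not study.extra_detections_improved_outcomes:
        flags.append("AI detector found more disease without evidence of better outcomes.")
    if not study.harms_in_summary_table:
        flags.append("Harms are missing from the summary table.")
    return flags


# Example: a paper that trips every rule.
worrying_paper = StudySummary(
    survival_after_diagnosis_improved=True,
    mortality_fell=False,
    incidence_rose_sharply=True,
    advanced_disease_fell_comparably=False,
    is_ai_detector=True,
    extra_detections_improved_outcomes=False,
    harms_in_summary_table=False,
)

for flag in reviewer_flags(worrying_paper):
    print("-", flag)
```

An empty list does not certify the study. It only means none of these four shortcuts were taken.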

Why This Matters for AI-Era Methodology

Aqrab keeps seeing the same problem in modern detection papers: model performance is treated as a proxy for patient benefit. It is not. A detector can improve classification metrics, increase diagnostic yield, and still worsen the clinical tradeoff if it mostly expands the market for unnecessary labels and interventions.

If you are evaluating a screening or diagnostic study and want a faster methodological stress test, Aqrab can help you pressure-check endpoints, causal claims, and reviewer red flags before the abstract talks you into admiring the wrong number. Start with Aqrab Try or explore the methodology stack at /developers.
