Last Observation Carried Forward: When Yesterday's Outcome Pretends the Patient Stopped Changing

June 10, 2026 · 16 min read

Anas H. Alzahrani, MD PhD MPH

Department of Preventive Medicine and Public Health

Faculty of Medicine, King Abdulaziz University

Few methods in applied clinical research have aged as strangely as last observation carried forward. It is easy to explain, easy to code, and easy to defend with the sort of confidence that should make reviewers nervous.

The basic move is simple: if a patient disappears before the endpoint visit, freeze their outcome at the last recorded value and call that the final answer. Convenient, yes. Neutral, no. LOCF smuggles in a trajectory assumption about exactly the patients whose trajectories are already most uncertain.

The Core Decision Rule

Do not ask whether LOCF keeps everyone in the analysis. Ask whether the post-dropout path it assumes is clinically believable for the patients who left.

Decision rule:

If patients could plausibly improve, worsen, relapse, or die after their last observed visit, then LOCF is not a harmless imputation shortcut. It is a structural outcome assumption.

That assumption can exaggerate benefit, dilute benefit, or reverse the sign of the effect. The direction depends on who dropped out, when they dropped out, and what would have happened next. In other words, the supposedly simple method becomes simple only by refusing to look at the hard part.

Why LOCF Keeps Surviving

It feels conservative

Analysts often imagine that freezing a dropout prevents over-optimism. Sometimes it does. Sometimes it preserves a transient response that would have vanished a week later.

It feeds sample-size theater

Nobody likes watching randomized patients disappear from the analysis table. LOCF offers the visual comfort of completeness without the inferential honesty of a real missing-data strategy.

It turns dynamics into bookkeeping

Longitudinal disease courses are messy. LOCF replaces that mess with one frozen number and hopes nobody asks whether symptoms, biomarkers, or survival states usually stop moving on command.

A Concrete Clinical Example

Imagine a 12-week randomized pain trial. The active arm improves quickly in the first month, but gastrointestinal adverse effects push a subset of patients out by week 6. Control patients improve more slowly yet stay on treatment more often.

What LOCF does

It takes the last pain score from each dropout, freezes it, and treats that frozen score as if the patient remained stable through week 12.
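
As a minimal sketch of that mechanic, the function below carries the last recorded value forward over missing visits. The visit schedule and pain scores are hypothetical, chosen only to illustrate the freeze; they are not data from any trial.

```python
from typing import Optional


def locf(scores: list[Optional[float]]) -> list[Optional[float]]:
    """Carry the last observed value forward over missing (None) visits."""
    filled: list[Optional[float]] = []
    last_seen: Optional[float] = None
    for value in scores:
        if value is not None:
            last_seen = value
        # Stays None until the patient has at least one recorded value.
        filled.append(last_seen)
    return filled


# Hypothetical pain scores (lower is better) at weeks 0, 2, 4, 6, 8, 12.
# This illustrative patient drops out after the week-6 visit.
observed = [7.0, 5.0, 4.0, 4.0, None, None]
print(locf(observed))  # [7.0, 5.0, 4.0, 4.0, 4.0, 4.0]
```

Nothing in the function knows why the patient left or what happened next. The week-12 value is simply a copy of week 6.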

What may actually happen

Patients who leave because of toxicity may discontinue the drug, rebound symptomatically, seek rescue therapy, or have unrecorded worsening. Stability is not the default just because the dataset ran out.

Why the estimate drifts

Under LOCF, the active arm keeps the early benefit recorded for patients who later left because of poor tolerability, while the control arm may be penalized or flattered differently depending on its own dropout pattern.

This is why LOCF is not just about missingness. It is about outcome trajectory. The method behaves as if the patient became a screenshot.

Interactive LOCF bias explorer

When dropouts keep changing after they leave, LOCF quietly edits the treatment effect

This toy model uses a continuous outcome where higher gain means more improvement by week 12. LOCF freezes each dropout at the last recorded value, even if the patient would have improved further or deteriorated after that point.

LOCF distortion: 2.0 points (apparent effect minus true effect)

Treatment-arm dropout rate: 28.0%

Higher and more differential dropout gives LOCF more room to invent the endpoint.

Control-arm dropout rate: 14.0%

Even if both groups drop out, different rates can turn a convenience method into asymmetric bias.

Mean week-12 gain among treatment completers: 24.0 points

Mean week-12 gain among control completers: 15.0 points

Last observed gain before treatment dropout: 18.0 points

Last observed gain before control dropout: 11.0 points

True change after treatment dropout to week 12: -8.0 points

Negative values mean the patient worsens after leaving. Positive values mean improvement continues offstage.

True change after control dropout to week 12: -2.0 points

True week-12 effect

5.9 points

What the study would estimate if all patients were actually observed through the endpoint.

LOCF-imputed effect

7.9 points

The treatment contrast after every dropout is frozen in amber at the last visit.

Bias direction cue

LOCF exaggerates benefit

The direction of the bias can flip depending on who drops out and what would have happened after the last observed visit.

Quantity	Value	Why it matters
Treatment true mean gain	20.1 points	Includes the post-dropout path LOCF refuses to acknowledge.
Treatment LOCF mean gain	22.3 points	Every dropout is treated as if their outcome trajectory flatlined after the last visit.
Control true mean gain	14.2 points	The same missing-data mechanism can bias the comparator differently.
Control LOCF mean gain	14.4 points	Equal methods do not imply equal bias when dropout timing and prognosis diverge by arm.

How to read the toy model

This is a teaching device, not a longitudinal mixed model. It strips the issue down to the uncomfortable part: the last observed value is rarely the last meaningful value.
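
The arithmetic behind the toy model can be reproduced in a few lines. The sketch below assumes each arm's mean is a weighted average of completers and dropouts, which is the simplest structure consistent with the displayed numbers; the `Arm` class and its field names are illustrative, not part of the original explorer.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    dropout_rate: float         # proportion who leave before week 12
    completer_gain: float       # mean week-12 gain among completers
    last_observed_gain: float   # mean gain at the last visit before dropout
    post_dropout_change: float  # true change from last visit to week 12

    def true_mean(self) -> float:
        """Mean gain if every patient were observed through week 12."""
        stay = 1.0 - self.dropout_rate
        dropout_final = self.last_observed_gain + self.post_dropout_change
        return stay * self.completer_gain + self.dropout_rate * dropout_final

    def locf_mean(self) -> float:
        """Mean gain when each dropout is frozen at the last observed value."""
        stay = 1.0 - self.dropout_rate
        return stay * self.completer_gain + self.dropout_rate * self.last_observed_gain


# Parameter values as listed in the toy model.
treatment = Arm(0.28, 24.0, 18.0, -8.0)
control = Arm(0.14, 15.0, 11.0, -2.0)

true_effect = treatment.true_mean() - control.true_mean()
locf_effect = treatment.locf_mean() - control.locf_mean()

print(f"Treatment true / LOCF: {treatment.true_mean():.1f} / {treatment.locf_mean():.1f}")
print(f"Control true / LOCF:   {control.true_mean():.1f} / {control.locf_mean():.1f}")
print(f"True effect: {true_effect:.1f}, LOCF effect: {locf_effect:.1f}")
print(f"Distortion:  {locf_effect - true_effect:.1f}")
```

Changing `post_dropout_change` to a positive value for the treatment arm is a quick way to see LOCF dilute rather than exaggerate the effect.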

Decision rule

If the unobserved post-dropout path matters clinically and dropout differs by arm or prognosis, LOCF is not conservative. It is just unverified.
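
A short derivation shows why the rule turns on dropout rates and post-dropout paths. The notation is introduced here for illustration: for arm a, p_a is the dropout rate, c_a the completer mean, ℓ_a the mean last observed gain among dropouts, and δ_a the true mean change after dropout. It assumes the same weighted-average structure as the toy model.

```latex
\begin{aligned}
\mu_a^{\text{true}} &= (1 - p_a)\,c_a + p_a\,(\ell_a + \delta_a) \\
\mu_a^{\text{LOCF}} &= (1 - p_a)\,c_a + p_a\,\ell_a \\
\text{distortion} &= \bigl(\mu_T^{\text{LOCF}} - \mu_C^{\text{LOCF}}\bigr)
                   - \bigl(\mu_T^{\text{true}} - \mu_C^{\text{true}}\bigr)
                   = p_C\,\delta_C - p_T\,\delta_T \\
&= 0.14\,(-2.0) - 0.28\,(-8.0) = 1.96 \approx 2.0
\end{aligned}
```

The completer means cancel. In this simplified setup, LOCF is unbiased only when p_T δ_T equals p_C δ_C, which is a condition about unobserved data and cannot be checked from the trial itself.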

How LOCF Usually Fails

Failure mode	What LOCF assumes	Why that is risky
Adverse-event dropout after early response	Improvement would have persisted unchanged off treatment.	This can exaggerate efficacy by preserving a temporary gain that might have decayed quickly.
Lack-of-efficacy dropout	The patient would have stayed stuck at the same poor value.	If outcomes would have worsened further, LOCF can hide treatment failure by stopping the decline early.
Natural improvement continues after dropout	No additional recovery occurs once observation ends.	The method can unfairly flatten recovery and dilute a real treatment effect.
Differential timing of dropout across arms	A frozen week-4 score and a frozen week-10 score are equally defensible.	Earlier dropout means more unobserved time, so the amount of hidden trajectory being guessed can differ sharply by arm.
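
The timing row can be made concrete with a deliberately crude sketch. It assumes outcomes drift at a constant weekly rate after dropout, which is an illustrative assumption rather than a claim about any real disease course; `hidden_change` and `weekly_drift` are hypothetical names.

```python
def hidden_change(dropout_week: int, endpoint_week: int = 12,
                  weekly_drift: float = -1.0) -> float:
    """Unobserved change that LOCF ignores, under an assumed constant weekly drift."""
    if not 0 <= dropout_week <= endpoint_week:
        raise ValueError("dropout_week must lie between 0 and endpoint_week")
    return weekly_drift * (endpoint_week - dropout_week)


# Same assumed drift, different dropout timing.
for week in (4, 10):
    print(f"Dropout at week {week}: LOCF ignores {hidden_change(week):+.1f} points")
```

With an identical drift, a week-4 dropout hides four times as much trajectory as a week-10 dropout. If one arm tends to lose patients earlier, the amount of guessing is unequal even when the dropout rates match.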

What Reviewers Should Ask Instead of Nodding at the Imputation Footnote

Red flags

Dropout differs meaningfully between arms.
Patients leave for reasons related to efficacy, tolerability, or prognosis.
The outcome is expected to keep evolving after the last observed visit.
LOCF is described as conservative without a clinical argument for why.
No sensitivity analysis explores alternative post-dropout trajectories.

Better questions

What estimand is the analysis targeting after intercurrent events and discontinuation?
Would a mixed model, multiple imputation, or explicit sensitivity analysis match that estimand better?
How much of the endpoint depends on unobserved post-dropout behavior?
Are reasons for missingness documented well enough to defend MAR?
Does the conclusion survive a plausible MNAR stress test?

When a Frozen Value Might Be Defensible, and Why That Still Does Not Rescue LOCF as a Default

There are narrow cases where a carried-forward value approximates the scientific question reasonably well. An irreversible outcome, a very short gap between the last visit and endpoint, or a prespecified estimand tied to treatment discontinuation can shrink the damage.

Even there, the burden is on the analyst to show why the frozen value corresponds to the estimand rather than merely to software convenience. "We used LOCF because that is what prior studies did" is not a method; it is folklore.

A Practical Replacement Hierarchy

Situation	Usually better move	Why
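Repeated continuous outcomes with plausibly MAR dropout (typically a hypothetical estimand)	Mixed model for repeated measures	Uses observed longitudinal structure rather than pretending the trajectory stopped moving. A treatment-policy estimand additionally needs outcomes collected after discontinuation.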
Covariate- and history-rich missingness under plausible MAR	Multiple imputation aligned to the analysis model	Makes the assumptions explicit and lets the imputation model learn from observed patterns.
Concern that missingness may be MNAR	Pattern-mixture or tipping-point sensitivity analysis	Forces the paper to show how much hidden deterioration or recovery would change the conclusion.
Outcome changes meaningfully after discontinuation	Explicit estimand choice before method choice	The real problem is often conceptual: what outcome after what intercurrent event are you trying to estimate?
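
A tipping-point analysis can be sketched with the toy-model summary numbers. This simplified version shifts the carried-forward gain of treatment-arm dropouts by a penalty `delta` and reports the first value at which the point estimate is no longer positive. Real analyses usually apply the shift inside multiple imputation and track statistical significance rather than the sign of the estimate; the function names, step size, and search floor are assumptions for illustration.

```python
from typing import Optional


def adjusted_effect(delta_treatment: float, delta_control: float = 0.0) -> float:
    """Week-12 effect when dropouts' carried-forward gains are shifted by delta."""
    p_t, p_c = 0.28, 0.14  # dropout rates from the toy model
    trt = (1 - p_t) * 24.0 + p_t * (18.0 + delta_treatment)
    ctl = (1 - p_c) * 15.0 + p_c * (11.0 + delta_control)
    return trt - ctl


def tipping_point(step: float = 0.5, floor: float = -40.0) -> Optional[float]:
    """First delta, stepping downward from 0, at which the effect is <= 0."""
    delta = 0.0
    while delta >= floor:
        if adjusted_effect(delta) <= 0.0:
            return delta
        delta -= step
    return None  # conclusion never tips within the searched range


print(f"Effect with no shift (LOCF): {adjusted_effect(0.0):.2f}")
print(f"Tipping point for treatment dropouts: {tipping_point()}")
```

The output is a question for clinicians, not an answer: is a post-dropout deterioration of that size plausible for patients who left the active arm? If it is, the headline result depends on data nobody collected.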

Where Aqrab Fits

LOCF tends to survive in manuscripts because it sits in the methods section wearing a vintage respectability blazer. The abstract sounds serious. The tables look complete. The missing-data assumption is hiding in one acronym.

That is the sort of quiet methodological overreach Aqrab is built to catch. If you want a fast critique of whether the estimand, dropout story, and analysis actually agree, start with Aqrab. If your methods team wants those checks embedded into a reproducible review workflow, the developer tools are the natural next stop.

The Bottom Line

LOCF does not solve missing outcomes. It replaces them with a clinical fairy tale: whatever was true at the last visit stayed true afterward. Sometimes that tale is optimistic. Sometimes punitive. It is almost never assumption-free.

When the endpoint keeps evolving after patients disappear, the last observed value is not a destination. It is just the last page you managed to read.
