Landmark Analysis: Useful, Honest, and Frequently Overclaimed
Landmark analysis is one of those methods that can be either admirably honest or quietly misleading, depending on how it is used.
Used well, it gives you a coherent way to study prognosis or treatment effects among patients who are still alive and eligible at a clinically meaningful time point. Used badly, it becomes a decorative way to hide immortal time bias and pretend the original causal question survived intact. It did not.
What Landmark Analysis Actually Does
Pick a fixed time after cohort entry — 30 days, 90 days, 6 months, whatever is clinically defensible. Keep only patients who are still under observation and event-free at that time. Define treatment or exposure status using information available up to the landmark. Then start follow-up from the landmark forward.
Core truth:
Landmark analysis does not repair the original baseline question. It asks a new question: among patients who made it to the landmark, what happens next?
That can be a very good question. It is just not the same as estimating the effect of treatment assignment at baseline in the full eligible population.
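The four steps above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the `Patient` and `LandmarkRecord` schemas, the field names, and the `build_landmark_cohort` helper are all hypothetical, and times are assumed to be days since cohort entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    """One cohort member (hypothetical schema); times are days since cohort entry."""
    pid: str
    followup_end: float                       # day of event or censoring, whichever came first
    had_event: bool                           # True if follow-up ended with the outcome event
    treatment_start: Optional[float] = None   # None if never treated during follow-up


@dataclass
class LandmarkRecord:
    pid: str
    exposed: bool               # treatment started on or before the landmark
    time_from_landmark: float   # follow-up time re-based to the landmark
    had_event: bool


def build_landmark_cohort(patients, landmark_day):
    """Keep patients still observed and event-free at the landmark, classify
    exposure from pre-landmark information only, and restart the clock."""
    cohort = []
    for p in patients:
        # Follow-up that ended on or before the landmark (event or censoring) is excluded.
        if p.followup_end <= landmark_day:
            continue
        # Treatment starting after the landmark is deliberately ignored.
        exposed = p.treatment_start is not None and p.treatment_start <= landmark_day
        cohort.append(LandmarkRecord(
            pid=p.pid,
            exposed=exposed,
            time_from_landmark=p.followup_end - landmark_day,
            had_event=p.had_event,
        ))
    return cohort


patients = [
    Patient("A", followup_end=12, had_event=True),                        # early event: excluded
    Patient("B", followup_end=200, had_event=False, treatment_start=20),  # treated before landmark
    Patient("C", followup_end=150, had_event=True, treatment_start=90),   # treated after landmark
    Patient("D", followup_end=45, had_event=False),                       # censored early: excluded
    Patient("E", followup_end=365, had_event=False),                      # never treated
]

cohort = build_landmark_cohort(patients, landmark_day=60)
for rec in cohort:
    print(rec)
```

Patient C is the instructive case: treatment began on day 90, after the landmark, so the sketch classifies C as unexposed. Whether that is acceptable depends on the question, but it is the price of not peeking past the landmark.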
Why People Reach For It
The usual setup is awkward timing. Treatment is not assigned neatly at time zero. Patients initiate later, switch after reassessment, or qualify only after surviving long enough to accumulate new information. If you compare “ever treated” versus “never treated” from baseline, the treated group may need to survive long enough to receive treatment. That survival time is not free. It is immortal time wearing a fake mustache.
What landmarking helps with
It aligns eligibility, treatment classification, and follow-up at a shared start time for the subset who remain eligible at that moment.
What it does not help with
It does not magically recover the effect of a baseline treatment strategy in everyone who entered the cohort.
A Fast Example
Suppose you are studying whether starting biologic therapy after hospital discharge reduces one-year readmission in patients with inflammatory disease. Many patients start within the first 60 days, some later, and some never.
If you classify patients as “treated” from discharge whenever they start at any point in those first 60 days, you hand the treated group survival credit it did not earn. To count as treated, a patient must remain alive, uncensored, and not yet readmitted until the day treatment starts. A patient readmitted on day 12, before starting therapy, lands in the untreated group, which quietly makes the treated group look better.
A 60-day landmark analysis would instead ask: among patients who are alive, still observed, and not yet readmitted at day 60, how do outcomes compare between those who started biologics by day 60 and those who did not?
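A toy simulation makes the contrast concrete. It is a sketch under loud assumptions that are not from any real dataset: readmission times are exponential with a 300-day mean, half of patients intend to start biologics at a uniformly random day in the first 60, treatment has no effect at all, and there is no censoring or confounding. Under that true null, any gap between groups is a design artifact.

```python
import random

LANDMARK = 60    # days after discharge
HORIZON = 365    # one-year readmission window

rng = random.Random(2024)
patients = []
for _ in range(40000):
    readmit_day = rng.expovariate(1 / 300)   # same readmission hazard for everyone (true null)
    wants_treatment = rng.random() < 0.5
    planned_start = rng.uniform(0, LANDMARK) if wants_treatment else None
    # A patient readmitted before the planned start is never classified as treated.
    started = planned_start is not None and planned_start < readmit_day
    patients.append((readmit_day, started))


def one_year_risk(group):
    """Share of the group readmitted by day 365 (no censoring in this toy setup)."""
    return sum(day <= HORIZON for day, _ in group) / len(group)


# Naive design: "ever started within 60 days", follow-up counted from discharge.
naive_gap = one_year_risk([p for p in patients if p[1]]) - one_year_risk(
    [p for p in patients if not p[1]]
)

# Landmark design: keep only patients not yet readmitted at day 60, then compare.
at_landmark = [p for p in patients if p[0] > LANDMARK]
landmark_gap = one_year_risk([p for p in at_landmark if p[1]]) - one_year_risk(
    [p for p in at_landmark if not p[1]]
)

print(f"naive risk difference:    {naive_gap:+.3f}")
print(f"landmark risk difference: {landmark_gap:+.3f}")
print(f"excluded before landmark: {len(patients) - len(at_landmark)}")
```

The naive comparison shows a spurious benefit for treated patients, while the landmark comparison sits near zero. The last line is the other half of the story: everyone readmitted before day 60 has left the analysis, so the landmark estimate says nothing about them.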
The Estimand Changes. Say So Out Loud.
| Question | Population | Time zero |
|---|---|---|
| Baseline treatment strategy effect | All eligible patients at cohort entry | Baseline |
| Landmark comparison | Patients alive, observed, and event-free at the landmark | Landmark time |
The second question is conditional on surviving and remaining eligible to the landmark. That conditioning is not a footnote. It changes who is being studied and what decision your estimate can inform.
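Writing the two rows as formulas can make the difference harder to gloss over. The notation here is an assumption introduced for illustration: $T$ is the event time, $C$ the censoring time, $\tau$ the follow-up horizon, $\ell$ the landmark, $T^{a}$ the event time that would occur under baseline strategy $a$, and $A_\ell$ the exposure status classified by the landmark.

```latex
\begin{align*}
\text{Baseline strategy effect:}\quad
  & \Pr\!\left(T^{a=1} \le \tau\right) - \Pr\!\left(T^{a=0} \le \tau\right) \\
\text{Landmark comparison:}\quad
  & \Pr\!\left(T \le \tau \mid T > \ell,\; C > \ell,\; A_\ell = 1\right)
    - \Pr\!\left(T \le \tau \mid T > \ell,\; C > \ell,\; A_\ell = 0\right)
\end{align*}
```

The first line is defined over everyone eligible at baseline. The second is an observed contrast among patients with $T > \ell$ and $C > \ell$, and it only acquires a causal reading if confounding at the landmark is also handled.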
When Landmark Analysis Is Defensible
A real clinical decision happens at the landmark
For example, escalate treatment at 90 days if symptoms persist, or assess transplant eligibility at 6 months.
Exposure status is meaningfully knowable by then
The landmark gives enough time for treatment initiation or response classification without relying on future information after follow-up starts.
The conditional population is scientifically relevant
Sometimes clinicians truly care about prognosis among patients who reach a milestone event-free.
You are willing to report the restriction honestly
Not “we avoided immortal time bias and estimated the treatment effect,” but “we estimated outcomes among landmark survivors.” Much better.
Choosing the Landmark Time
The landmark should come from clinical logic, protocol logic, or a prespecified decision window — not from whichever cutoff makes the Kaplan-Meier curves look dramatic.
1. Make it clinically interpretable
A 30-day post-discharge reassessment window makes sense if that is how care is delivered. Day 47 because the model liked it does not.
2. Make it long enough to classify exposure
If almost nobody can start treatment before the landmark, you built a design that answers nothing interesting.
3. Notice what you lose
Later landmarks discard more early events and focus on a more selected, often healthier, survivor population.
4. Prespecify when possible
If you try five landmarks and publish the prettiest one, that is not sophistication. That is shopping.
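One way to keep points 2 and 3 honest is to tabulate, for each prespecified candidate landmark, who is lost and how the remainder is classified, before looking at any outcome comparison. The sketch below assumes a hypothetical record layout (`end_day`, `had_event`, `start_day`) and invented toy data.

```python
def landmark_attrition(patients, landmark_day):
    """Count who is lost before a landmark and how the remainder is classified.

    Each patient is a dict with hypothetical keys:
      end_day   - day of event or censoring, whichever came first
      had_event - True if follow-up ended with the outcome event
      start_day - treatment start day, or None if never treated
    """
    lost = [p for p in patients if p["end_day"] <= landmark_day]
    retained = [p for p in patients if p["end_day"] > landmark_day]
    exposed = [
        p for p in retained
        if p["start_day"] is not None and p["start_day"] <= landmark_day
    ]
    return {
        "landmark_day": landmark_day,
        "events_before": sum(1 for p in lost if p["had_event"]),
        "censored_before": sum(1 for p in lost if not p["had_event"]),
        "retained": len(retained),
        "exposed": len(exposed),
    }


# Invented toy cohort, for illustration only.
toy_cohort = [
    {"end_day": 12,  "had_event": True,  "start_day": None},
    {"end_day": 40,  "had_event": False, "start_day": 10},
    {"end_day": 100, "had_event": True,  "start_day": 25},
    {"end_day": 200, "had_event": False, "start_day": 70},
    {"end_day": 365, "had_event": False, "start_day": None},
    {"end_day": 85,  "had_event": True,  "start_day": None},
    {"end_day": 365, "had_event": False, "start_day": 150},
    {"end_day": 20,  "had_event": False, "start_day": None},
]

# Candidate landmarks should be fixed in advance, not picked from results.
summary = [landmark_attrition(toy_cohort, day) for day in (30, 90, 180)]
for row in summary:
    print(row)
```

A table like this belongs in the protocol or supplement. It shows the cost of each landmark without involving the effect estimate, which is exactly what separates choosing a landmark from shopping for one.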
The Main Limitation: Conditioning on Survival Can Select a Strange Population
To reach the landmark, patients must avoid the event, avoid censoring, and remain in the study long enough to be classified. That can induce selection problems, especially if the forces determining survival to landmark are related to future prognosis and treatment choice.
- Sicker early-failing patients disappear before analysis starts.
- Treatment groups at the landmark may differ in prognosis, because the reasons patients started, delayed, or skipped treatment kept accumulating during the pre-landmark period.
- Post-baseline variables used for treatment classification may themselves be affected by prior care.
So yes, landmarking can reduce one design failure while introducing a narrower target population and a new selection structure. That is still sometimes a good trade. It is just not a silent one.
Landmark Analysis Versus Other Fixes
| Approach | Best for | Main cost |
|---|---|---|
| Landmark analysis | Conditional questions about patients who reach a milestone | Changes the estimand and discards early follow-up |
| Time-varying exposure modeling | When treatment status genuinely changes over time | Needs careful time alignment and stronger modeling discipline |
| Target trial emulation / clone-censor-weight | Strategy-level causal questions with grace periods or dynamic assignment | More design work, more assumptions, more machinery |
If the real question is baseline strategy assignment, landmark analysis is often the simpler but less causally faithful option. If the real question is conditional prognosis at a milestone, landmark analysis may be exactly right.
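For contrast, the "careful time alignment" in the time-varying row usually means splitting each patient's follow-up at treatment start, so that pre-treatment days count as untreated person-time instead of being credited to the treated group. The helper below is a hypothetical sketch of that split only; it is not a full time-varying model and does nothing about time-varying confounding.

```python
def split_person_time(end_day, had_event, start_day):
    """Split one patient's follow-up into (start, stop, treated, event) intervals.

    The event, if any, is attached to the interval in which follow-up ends.
    """
    # Never treated, or treatment would have started after follow-up ended.
    if start_day is None or start_day >= end_day:
        return [(0, end_day, False, had_event)]
    # Treated from day 0: a single treated interval.
    if start_day <= 0:
        return [(0, end_day, True, had_event)]
    return [
        (0, start_day, False, False),          # untreated person-time before initiation
        (start_day, end_day, True, had_event), # treated person-time after initiation
    ]


# Invented toy records, for illustration only.
records = [
    {"pid": "A", "end_day": 12,  "had_event": True,  "start_day": None},
    {"pid": "B", "end_day": 100, "had_event": True,  "start_day": 30},
    {"pid": "C", "end_day": 365, "had_event": False, "start_day": 45},
]

untreated_days = 0
treated_days = 0
for r in records:
    intervals = split_person_time(r["end_day"], r["had_event"], r["start_day"])
    for start, stop, treated, _event in intervals:
        if treated:
            treated_days += stop - start
        else:
            untreated_days += stop - start

print("untreated person-days:", untreated_days)
print("treated person-days:", treated_days)
```

Unlike landmarking, this keeps early events and early follow-up in the analysis. The cost is the one the table names: every interval boundary has to be right, and the modeling burden goes up.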
Common Mistakes
1. Pretending the landmark subset is the original cohort
Once you condition on event-free survival to the landmark, you are no longer estimating the same population-level effect.
2. Choosing the landmark after looking at outcomes
A data-driven landmark can turn an already flexible design into a significance vending machine.
3. Using post-landmark information to define exposure
If classification requires peeking past the landmark, you broke the design you were trying to save.
4. Ignoring confounding at the landmark
Restricting the cohort does not make treatment groups exchangeable. It just creates a new baseline where confounding still needs attention.
What Good Reporting Looks Like
- State the landmark time and why it was chosen.
- Report how many patients were excluded before the landmark and why.
- Define treatment or exposure status using only pre-landmark information.
- Describe the target population as patients event-free and under observation at the landmark.
- Show covariate balance between exposure groups as measured at the landmark, not just the characteristics of the original cohort at entry.
- Explain why landmarking was preferable to time-varying exposure or target trial approaches for this question.
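For the covariate balance item, one common summary is the standardized mean difference computed within the landmark cohort. The sketch below assumes a continuous covariate and uses invented ages; the function name and the example values are illustrative, not drawn from any study.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(exposed_values, unexposed_values):
    """Difference in group means divided by the pooled standard deviation.

    Intended for a continuous covariate measured at the landmark.
    """
    pooled_sd = sqrt((variance(exposed_values) + variance(unexposed_values)) / 2)
    if pooled_sd == 0:
        return 0.0  # no variation in either group, so no standardized gap to report
    return (mean(exposed_values) - mean(unexposed_values)) / pooled_sd


# Invented ages at the landmark for patients who reached it.
age_exposed = [50, 60, 70]
age_unexposed = [55, 65, 75]

smd_age = standardized_mean_difference(age_exposed, age_unexposed)
print(f"SMD for age at landmark: {smd_age:.2f}")
```

Reporting this for covariates measured at the landmark, alongside the same covariates at cohort entry, shows readers how much the groups drifted during the pre-landmark period.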
Reviewer Red Flags
- “Ever treated” exposure definitions paired with survival curves starting at cohort entry.
- No count of patients lost before the landmark, as if the subset appeared by magic.
- Language claiming baseline causal effects despite landmark restriction.
- Landmark time chosen with no clinical rationale.
- No acknowledgement that early events are excluded from the estimand.
The Practical Bottom Line
Landmark analysis is not a cheat code. It is a trade.
You trade a muddled and often biased baseline comparison for a cleaner conditional question among patients who reach a clinically meaningful time point. That trade can be excellent when the landmark corresponds to a real decision moment. It can also be a tidy way to answer the wrong question with great confidence.
If you use it, be precise: say who entered the analysis, when follow-up started, what treatment status meant at that moment, and which estimand you gave up. A method does not become rigorous because the timeline looks neater. It becomes rigorous when the question, design, and reporting finally agree with each other.