
Estimands: The Causal Question You Should Define Before Running the Analysis

May 1, 2026 · 16 min read · By Coefficients Health Analytics

Most causal arguments fall apart before the model runs. Not because the regression was wrong, but because the research team never pinned down the exact effect they wanted to estimate. They say they are estimating “the effect of treatment” when that phrase could mean any of several different questions.

My take is blunt: if the estimand is vague, the analysis is performative. You cannot judge whether a method is correct if the causal question itself is still fuzzy.

What an Estimand Actually Is

An estimand is the precise treatment effect you want to estimate. It forces you to define the population, treatment strategies, outcome, how you will handle events after treatment starts, and the summary measure used to compare groups.

Core idea:

The estimand is the question. The estimator is the tool. People love arguing about tools while quietly skipping the question.

This is why two studies can analyze the same treatment and the same dataset yet reach different conclusions without either one being mathematically wrong. They may be targeting different effects.

The Fast Intuition

Suppose you are studying a new antihypertensive drug. Are you asking what happens if patients are assigned to start it, regardless of later discontinuation? Or what happens if they actually stay on it as intended? Or what happens before rescue medication gets added?

Those are not wording tweaks. They are different causal questions, and they lead to different design choices, censoring rules, and interpretations.

The Five Pieces You Need to Specify

Component	What it answers	Typical failure
Population	Who is the effect for?	Eligibility is vague or changes after analysis starts.
Treatment strategies	What interventions are being compared?	“Treatment” is defined loosely or with future information.
Outcome	What endpoint matters?	Composite outcomes hide clinically different events.
Intercurrent events	What do we do with discontinuation, switching, rescue therapy, death?	Post-baseline events are handled ad hoc after seeing the data.
Summary measure	Risk ratio, risk difference, hazard ratio, mean difference?	Measure is chosen for convenience, not because it matches the decision problem.
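
One way to make the five components hard to skip is to write them down as a structured object before any modelling starts. The sketch below is a minimal illustration, not a standard API: `EstimandSpec`, its field names, and every example value are hypothetical.

```python
from dataclasses import dataclass, fields


def _is_blank(value) -> bool:
    """True if a component was left empty (blank string, empty dict, blank tuple entry)."""
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, dict):
        return len(value) == 0
    if isinstance(value, tuple):
        return len(value) == 0 or any(_is_blank(v) for v in value)
    return value is None


@dataclass(frozen=True)
class EstimandSpec:
    """Hypothetical container for the five estimand components."""
    population: str
    treatment_strategies: tuple   # (strategy A, strategy B)
    outcome: str
    intercurrent_events: dict     # event name -> handling strategy
    summary_measure: str

    def missing_components(self) -> list:
        """Names of components that were left unspecified."""
        return [f.name for f in fields(self) if _is_blank(getattr(self, f.name))]

    def contrast_sentence(self) -> str:
        """One-sentence statement of the causal contrast."""
        a, b = self.treatment_strategies
        return (f"{self.summary_measure} in {self.outcome}, "
                f"comparing '{a}' versus '{b}', among {self.population}")


# Illustrative values only; not taken from any real study.
spec = EstimandSpec(
    population="adults starting first-line antihypertensive therapy",
    treatment_strategies=("start new drug", "start comparator"),
    outcome="stroke within 24 months",
    intercurrent_events={"discontinuation": "treatment policy",
                         "rescue medication": "hypothetical"},
    summary_measure="risk difference",
)

# A deliberately vague specification: three of five components are unusable.
vague = EstimandSpec("", ("treatment", ""), "stroke", {}, "hazard ratio")
```

Calling `vague.missing_components()` flags the population, the treatment strategies, and the intercurrent events as unspecified. A check like this cannot tell you whether the answers are good, only whether the question was asked.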

The Part Everyone Screws Up: Intercurrent Events

Intercurrent events are things that happen after treatment starts and either change the exposure a patient actually receives, change how the outcome should be interpreted, or prevent the outcome from being observed at all. Treatment discontinuation, switching, transplant, rescue medication, pregnancy, and death are classic examples.

Most papers deal with these badly. They either ignore them, censor them thoughtlessly, or bury them in supplement language. But how you handle them is not a technical footnote. It defines the estimand.

If one team censors at treatment switching and another keeps follow-up regardless of switching, they are not estimating the same thing. Full stop.
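
To see how much the follow-up rule alone can move the numbers, here is a small sketch on made-up records for a single treatment arm. The `Patient` type, the 12-month horizon, and the data are assumptions for illustration. The code only tallies events and person-time under each rule; it does not adjust for anything.

```python
from typing import NamedTuple, Optional

HORIZON = 12  # planned follow-up in months (illustrative)


class Patient(NamedTuple):
    switch_month: Optional[int]  # month of treatment switch, None if never switched
    event_month: Optional[int]   # month of outcome event, None if none by HORIZON


def follow_up(p: Patient, censor_at_switch: bool):
    """Return (person_months, had_event) for one patient under the chosen rule."""
    end = HORIZON
    if censor_at_switch and p.switch_month is not None:
        end = min(end, p.switch_month)
    if p.event_month is not None and p.event_month <= end:
        return p.event_month, True
    return end, False


def tally(patients, censor_at_switch: bool):
    """Return (events, person_months) summed over all patients."""
    results = [follow_up(p, censor_at_switch) for p in patients]
    return sum(e for _, e in results), sum(t for t, _ in results)


# Made-up records for one treatment arm.
arm = [
    Patient(None, 6), Patient(3, 8), Patient(None, None),
    Patient(4, None), Patient(None, None), Patient(2, 5),
]

keep_following = tally(arm, censor_at_switch=False)  # follow regardless of switching
censored = tally(arm, censor_at_switch=True)         # stop follow-up at the switch
```

On these records the first rule counts 3 events over 55 person-months and the second counts 1 event over 39. Neither is an arithmetic mistake. They answer different questions, and the censored version would still need to deal with informative censoring before anyone reads it causally.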

The Common Estimand Strategies

Treatment policy

Estimate the effect of starting treatment regardless of later deviations. This is closest to intention-to-treat logic.

Hypothetical

Estimate what would happen if an intercurrent event did not occur, such as no rescue medication or no discontinuation.

Composite

Fold the intercurrent event into the endpoint itself, for example death or treatment failure as a combined outcome.

While on treatment

Restrict the effect to the period before discontinuation or switching. Useful sometimes, but easy to bias if handled naively, because the patients who stop or switch are rarely a random subset of those who started.
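
The four strategies can be sketched as four different ways of turning the same patient record into an analysis row. This is a simplified illustration under stated assumptions: a single outcome event, a single intercurrent event, and hypothetical names (`Record`, `Derived`, `derive`). For the hypothetical strategy the code only marks the record as needing a model for what was not observed; it does not attempt that modelling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    event_month: Optional[int]  # month of the outcome event, None if none observed
    ice_month: Optional[int]    # month of the intercurrent event, None if none
    horizon: int = 12           # planned follow-up in months (illustrative)


@dataclass
class Derived:
    event: bool                       # does this row count as an event?
    end_month: int                    # where follow-up ends for this row
    needs_model_for_unobserved: bool = False


def _within(month: Optional[int], horizon: int) -> Optional[int]:
    """Keep a month only if it falls inside the follow-up window."""
    return month if month is not None and month <= horizon else None


def derive(rec: Record, strategy: str) -> Derived:
    ev = _within(rec.event_month, rec.horizon)
    ice = _within(rec.ice_month, rec.horizon)

    if strategy == "treatment_policy":
        # Ignore the intercurrent event; follow to outcome or horizon.
        return Derived(ev is not None, ev if ev is not None else rec.horizon)

    if strategy == "composite":
        # Either the outcome or the intercurrent event counts as the endpoint.
        times = [t for t in (ev, ice) if t is not None]
        return Derived(True, min(times)) if times else Derived(False, rec.horizon)

    if strategy in ("while_on_treatment", "hypothetical"):
        # Both stop using data at the intercurrent event; they differ in
        # interpretation and in what must be modelled afterwards.
        if ice is not None and (ev is None or ice < ev):
            return Derived(False, ice,
                           needs_model_for_unobserved=(strategy == "hypothetical"))
        return Derived(ev is not None, ev if ev is not None else rec.horizon)

    raise ValueError(f"unknown strategy: {strategy}")


# One made-up patient: rescue medication at month 4, outcome event at month 9.
patient = Record(event_month=9, ice_month=4)
rows = {s: derive(patient, s) for s in
        ("treatment_policy", "composite", "while_on_treatment", "hypothetical")}
```

The same patient is an event at month 9, an event at month 4, or a non-event with follow-up ending at month 4, depending only on the strategy chosen.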

Why This Matters in Observational Research Even More

Trial people talk about estimands because ICH E9(R1) made them. Observational researchers should care even more, because they already have extra ambiguity around time zero, eligibility, adherence, switching, and censoring.

Target trial emulation is basically estimand discipline applied to observational data.
Clone-censor-weight methods exist because per-protocol-style questions need explicit handling of deviation.
Immortal time bias often starts when the treatment strategy was never defined sharply enough.
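
The immortal time point is easiest to see as a person-time accounting problem. The sketch below shows one common mechanism, assuming a patient who becomes eligible on day 0 and starts treatment later. The function name and the numbers are hypothetical.

```python
from typing import Optional, Tuple


def allocate_person_time(treatment_start_day: Optional[int],
                         end_day: int,
                         naive: bool) -> Tuple[int, int]:
    """Return (untreated_days, treated_days) for follow-up from day 0 to end_day.

    naive=True classifies the whole follow-up by whether treatment was ever
    started, which uses future information. naive=False counts time as treated
    only from the day treatment actually starts.
    """
    if treatment_start_day is None or treatment_start_day >= end_day:
        return end_day, 0  # never treated during follow-up
    if naive:
        return 0, end_day  # pre-treatment days are wrongly credited to treatment
    return treatment_start_day, end_day - treatment_start_day


# Made-up patient: eligible on day 0, starts treatment on day 90, followed to day 200.
naive_split = allocate_person_time(90, 200, naive=True)
aligned_split = allocate_person_time(90, 200, naive=False)
```

Under the naive rule all 200 days count as treated, even though the patient had to survive 90 untreated days to be classified that way. Splitting the person-time is only one way to handle this; the broader fix is a treatment strategy defined sharply enough that classification never depends on the future.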

If your observational study has a loose treatment definition, a hand-wavy censoring rule, and a vague causal claim, the estimand problem is not academic. It is the main problem.

A Clinical Example

Imagine comparing early invasive versus conservative management after non-ST elevation acute coronary syndrome.

Estimand A

Effect of being assigned an early invasive strategy at baseline, regardless of later crossover or nonadherence.

Estimand B

Effect if patients actually followed their initially assigned strategy through a prespecified period without crossover.

A is closer to a policy question. B is closer to a biological or adherence-dependent question. Both can matter. But mixing them in the same paper is how interpretation turns into mush.

What Good Papers Do

State the causal contrast plainly

One sentence should tell the reader exactly what happens under strategy A versus strategy B, in whom, over what follow-up.

Define time zero with discipline

Eligibility, treatment assignment, and follow-up start should align. If they do not, your estimand is probably drifting already.
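
A quick alignment check can be sketched in a few lines. It assumes each patient has three dates available, and the function name and messages are hypothetical. It flags mismatches but cannot say whether a given mismatch is defensible in your design.

```python
from datetime import date


def time_zero_problems(eligibility_met: date,
                       strategy_assigned: date,
                       follow_up_start: date) -> list:
    """List possible time-zero misalignments for one patient."""
    problems = []
    if eligibility_met != follow_up_start:
        problems.append(
            "eligibility is assessed at a different time than follow-up start")
    if strategy_assigned > follow_up_start:
        problems.append(
            "strategy is assigned after follow-up starts "
            "(classification uses future information)")
    elif strategy_assigned < follow_up_start:
        problems.append(
            "follow-up starts after strategy assignment "
            "(early events are missed)")
    return problems


# Made-up dates for illustration.
aligned = time_zero_problems(date(2024, 1, 1), date(2024, 1, 1), date(2024, 1, 1))
late_assignment = time_zero_problems(date(2024, 1, 1), date(2024, 4, 1), date(2024, 1, 1))
```

Running a check like this across the cohort before modelling makes drift visible early, when it is still cheap to fix the design rather than explain it away.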

Pre-specify intercurrent event handling

Do not wait to see treatment switching patterns and then decide whether to censor, ignore, or redefine the endpoint.

Match method to estimand

Treatment-policy questions, per-protocol questions, and hypothetical questions often require different designs and analytic strategies.

Reviewer Red Flags

The paper says “effect of treatment” without defining what happens after discontinuation, switching, or rescue therapy.
Eligibility and exposure are defined using future information.
Censoring rules appear only in the statistical appendix and clearly were not part of the design logic.
The discussion interprets a while-on-treatment estimate as if it were an intention-to-treat policy effect.
Different tables in the same paper quietly target different causal questions.

The Practical Bottom Line

Estimands sound abstract until you realize they are really just a demand for intellectual honesty. What exactly are you trying to estimate? For whom? Under what treatment behavior? With what follow-up logic?

Answer that first. Then pick the design and estimator that serve it. Not the other way around. Because when the causal question is sloppy, the rest of the analysis is just polished confusion.
