MNAR Sensitivity Analysis: Because “We Assumed MAR” Is Not a Results Section
Anas H. Alzahrani, MD PhD MPH
Department of Preventive Medicine and Public Health
Faculty of Medicine, King Abdulaziz University
Missing data papers often follow a comforting liturgy: some outcomes were missing, multiple imputation was performed, baseline variables were included, and therefore the analysis was “robust.” That is tidy. It is also often unearned.
If patients disappear for reasons tied to the outcome you care about, then missing at random may be too optimistic. Once that is true, a single MAR analysis is not the answer. It is the baseline scenario from which the real methodological conversation should start.
The First Question: Why Are the Data Missing?
Before choosing software, choose a story. Not a fairy tale. A data-generating story.
| Missingness pattern | What it means | What usually follows |
|---|---|---|
| MCAR | Missingness is unrelated to observed or unobserved data | Rare in real clinical research; when it does hold, complete-case analysis loses efficiency but stays unbiased |
| MAR | Missingness may depend on observed data, but not on the unseen value after conditioning | Multiple imputation or likelihood methods can be reasonable if the conditioning set is credible |
| MNAR | Missingness still depends on the unseen value itself, even after observed covariates | No amount of polite MAR language rescues you; sensitivity analysis becomes mandatory |
The problem is not that MAR is mathematically illegitimate. The problem is that researchers often invoke it as a ritual rather than defend it as a scientific claim.
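A small simulation can make the three rows of the table concrete. The sketch below is illustrative only: the covariate (`severe`), every probability, and the function name `simulate` are assumptions invented for the example, and the severity-stratified estimate is a crude stand-in for what multiple imputation or a likelihood method does with an observed covariate.

```python
import random

def simulate(mechanism, n=100_000, seed=0):
    """Simulate one missingness mechanism for a binary outcome.

    Returns (true_risk, complete_case_risk, severity_adjusted_risk).
    All probabilities below are illustrative assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severe = rng.random() < 0.4                    # observed baseline covariate
        event = rng.random() < (0.5 if severe else 0.1)
        if mechanism == "MCAR":
            p_missing = 0.3                            # unrelated to anything
        elif mechanism == "MAR":
            p_missing = 0.5 if severe else 0.1         # depends on observed severity only
        elif mechanism == "MNAR":
            p_missing = 0.5 if event else 0.1          # depends on the unseen outcome itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        observed = rng.random() >= p_missing
        rows.append((severe, event, observed))

    true_risk = sum(e for _, e, _ in rows) / n
    seen = [(s, e) for s, e, o in rows if o]
    complete_case = sum(e for _, e in seen) / len(seen)

    # Adjust for the observed covariate: observed risk within each severity
    # stratum, weighted by that stratum's share of the full sample.
    adjusted = 0.0
    for level in (True, False):
        share = sum(1 for s, _, _ in rows if s == level) / n
        in_stratum = [e for s, e in seen if s == level]
        adjusted += share * sum(in_stratum) / len(in_stratum)
    return true_risk, complete_case, adjusted

for mechanism in ("MCAR", "MAR", "MNAR"):
    truth, cc, adj = simulate(mechanism)
    print(f"{mechanism}: truth={truth:.3f} complete-case={cc:.3f} severity-adjusted={adj:.3f}")
```

Under these assumed settings, conditioning on severity repairs the MAR scenario but not the MNAR one, because no observed variable carries the information that drives the dropout.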
When MAR Starts Looking Thin
Outcome-related dropout
Patients who worsen are harder to reach, less likely to complete questionnaires, or transfer care elsewhere. The missingness mechanism is now sniffing around the outcome itself.
Differential loss by treatment arm
If one arm has more adverse effects, burden, or disengagement, the unobserved outcomes may not be exchangeable across arms even after adjustment for measured covariates.
Weak predictors in the imputation model
Including age, sex, and baseline labs is not a magic trick if the real missingness driver is symptom burden, treatment fatigue, or a clinician judgment nobody recorded.
Post hoc reassurance
“Results were similar after imputation” sounds strong until you notice the imputation assumed away the very failure mode that worries you.
What Sensitivity Analysis Is Actually Doing
Sensitivity analysis does not identify the truth from data alone. It asks how the result changes when you stop pretending the missing outcomes resemble the observed ones.
A good sensitivity analysis answers one concrete question:
How different would the unobserved outcomes need to be before the clinical conclusion becomes weak, null, or reversed?
That can be approached with delta-adjusted multiple imputation, pattern-mixture models, selection models, best-case/worst-case bounds, or simpler tipping-point exercises. The exact machinery matters less than the core discipline: make the unverifiable assumption visible and stress it on purpose.
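As a rough illustration of "stress it on purpose," the sketch below shifts the log-odds of the event among missing treated patients by a sensitivity parameter `delta` and recomputes the risk difference over a grid. It is a point-estimate, pattern-mixture-style calculation, not full delta-adjusted multiple imputation: there are no imputation draws, no pooling, and no uncertainty intervals. The input risks and missingness fractions are hypothetical, as are the function names.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def delta_adjusted_risk(p_observed, frac_missing, delta):
    """Arm-level risk when the missing patients' log-odds of the event are
    shifted by `delta` relative to the observed patients. With delta = 0 the
    missing patients are assumed to share the observed risk."""
    p_missing = expit(logit(p_observed) + delta)
    return (1.0 - frac_missing) * p_observed + frac_missing * p_missing

def scan_deltas(p_trt, m_trt, p_ctl, m_ctl, deltas):
    """Risk difference (treatment minus control) for each delta applied to
    the treatment arm only; the control arm is held at delta = 0."""
    control = delta_adjusted_risk(p_ctl, m_ctl, 0.0)
    return [(d, delta_adjusted_risk(p_trt, m_trt, d) - control) for d in deltas]

# Hypothetical inputs: observed risks and missingness fractions per arm.
results = scan_deltas(p_trt=0.18, m_trt=0.12, p_ctl=0.24, m_ctl=0.08,
                      deltas=[0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0])

for delta, rd in results:
    print(f"delta={delta:.1f}  risk difference={rd * 100:+.1f} percentage points")
```

The delta at which the risk difference crosses zero is the tipping point on the log-odds scale. Whether that delta is clinically plausible is a judgment the code cannot make.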
Interactive sensitivity check
How bad would the missing outcomes need to be before the treatment effect stops looking good?
This is a simple tipping-point explorer for a binary outcome. It does not replace full pattern-mixture modeling or delta-adjusted multiple imputation, but it is excellent at forcing one honest question: what would the unobserved outcomes have to look like to erase the headline result?
Quick read
The treatment still looks meaningfully beneficial under these missing-outcome assumptions.
- Sensitivity-adjusted treatment risk: 19.7%
- Sensitivity-adjusted control risk: 24.3%
- Risk ratio under these assumptions: 0.81
| Quantity | Value | Why it matters |
|---|---|---|
| Observed-only risk difference | -6.0 percentage points | This is the estimate you would tell yourself if you pretended missingness was not trying to start trouble. |
| Sensitivity-adjusted risk difference | -4.6 percentage points | This is what the result looks like after you stop assuming the unobserved outcomes are harmless. |
| Treatment-arm missingness | 12.0% | Even modest differential missingness can do real damage when the missing patients are prognostically unusual. |
| Missing treatment-arm risk needed to erase the benefit | 70.7% | If the true event risk among the missing treated patients reaches this level, the apparent benefit vanishes. |
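The arithmetic behind a table like this is a two-component mixture per arm. In the sketch below, only the 12% treatment-arm missingness and the -6.0 point observed difference come from the figures above. The remaining inputs (observed risks of 18% and 24%, control-arm missingness of 8%, and missing-patient risks of 32% and 28%) are assumptions, chosen because they are arithmetically consistent with the quick-read numbers. The function names are hypothetical.

```python
def adjusted_risk(p_observed, frac_missing, p_missing):
    """Whole-arm risk as a mixture of observed and missing patients."""
    return (1.0 - frac_missing) * p_observed + frac_missing * p_missing

def tipping_point_risk(p_trt_observed, m_trt, control_adjusted):
    """Event risk among missing treated patients at which the adjusted
    treatment risk equals the adjusted control risk."""
    return (control_adjusted - (1.0 - m_trt) * p_trt_observed) / m_trt

# Assumed inputs (see the note above; only m_trt is stated in the article).
p_trt_obs, m_trt, p_trt_mis = 0.18, 0.12, 0.32
p_ctl_obs, m_ctl, p_ctl_mis = 0.24, 0.08, 0.28

trt = adjusted_risk(p_trt_obs, m_trt, p_trt_mis)
ctl = adjusted_risk(p_ctl_obs, m_ctl, p_ctl_mis)
tip = tipping_point_risk(p_trt_obs, m_trt, ctl)

print(f"Adjusted treatment risk: {trt * 100:.1f}%")
print(f"Adjusted control risk:   {ctl * 100:.1f}%")
print(f"Risk ratio:              {trt / ctl:.2f}")
print(f"Adjusted difference:     {(trt - ctl) * 100:+.1f} percentage points")
print(f"Tipping-point risk:      {tip * 100:.1f}%")
```

The tipping-point formula holds the control-arm assumption fixed. A tipping risk outside the range 0 to 1 would mean no assumption about the missing treated patients alone can erase the difference.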
How to use this honestly
- Start with a clinically plausible bad-news scenario. If patients who disappear are usually sicker, their event risk should rise, not magically match the observed group.
- Stress the more fragile arm first. Higher missingness in the apparently better arm is where false reassurance likes to hide.
- Do not oversell the precision. This is a transparent sensitivity sketch, not a full identification strategy.
Decision rule: if modestly worse outcomes among missing patients erase the headline effect, the discussion section should stop using confident causal language.
Clinical Example: A Stroke Rehabilitation Trial with Missing Functional Outcomes
Imagine a pragmatic stroke rehabilitation trial comparing standard therapy with an app-supported home program. Among patients with observed 90-day outcomes, the intervention arm looks better: fewer participants fail to regain independent function.
But the intervention arm also has more missing outcome assessments. Why? Perhaps disengaged patients stopped using the app, missed follow-up calls, and were exactly the people doing worse. If your imputation model contains age, sex, baseline stroke score, and discharge destination but not real-time engagement decline or caregiver strain, MAR begins to wobble.
What a stronger paper would show
- A transparent missingness table by arm, time point, and likely reason
- A primary MAR-based analysis plus at least one clinically plausible MNAR sensitivity analysis
- An explanation of why the chosen sensitivity parameters are plausible, not decorative
- A discussion that softens causal certainty when modest deviations from MAR erase the effect
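The first item on that list, a missingness table by arm, time point, and reason, takes very little code to produce. The sketch below uses invented records for the stroke example; the arm labels, reasons, counts, and the `missingness_table` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical records: (arm, time point, reason the outcome is missing or None).
records = (
    [("app", "day 90", None)] * 44
    + [("app", "day 90", "unreachable by phone")] * 4
    + [("app", "day 90", "withdrew")] * 2
    + [("standard", "day 90", None)] * 47
    + [("standard", "day 90", "transferred care")] * 3
)

def missingness_table(records):
    """Summarize missing outcomes per (arm, time point), with reasons."""
    totals, missing, reasons = Counter(), Counter(), Counter()
    for arm, visit, reason in records:
        totals[(arm, visit)] += 1
        if reason is not None:
            missing[(arm, visit)] += 1
            reasons[(arm, visit, reason)] += 1
    table = {}
    for key, n in totals.items():
        table[key] = {
            "n": n,
            "missing": missing[key],
            "pct_missing": 100.0 * missing[key] / n,
            "reasons": {r: c for (a, v, r), c in reasons.items() if (a, v) == key},
        }
    return table

table = missingness_table(records)
for (arm, visit), row in sorted(table.items()):
    print(f"{arm:9s} {visit}: {row['missing']}/{row['n']} missing "
          f"({row['pct_missing']:.1f}%) {row['reasons']}")
```

Recorded reasons are only as good as the follow-up process that captured them, so "unreachable" often deserves more suspicion than it gets.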
Decision Rules for Authors and Reviewers
- Do not ask whether imputation was used. Ask whether the missingness story makes MAR believable.
- If the primary outcome is missing in a prognostic way, require sensitivity analysis. Not as a supplement graveyard item. In the main argument.
- Anchor sensitivity parameters to clinical reality. “Missing patients are 10% worse” is weak unless you explain worse than whom, and why.
- Report how conclusions change, not just whether one p-value survives. Clinical inference lives on effect size, precision, and direction, not on a ritual significance badge.
- If modest MNAR departures overturn the result, say that plainly. Fragility is a result too.
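One blunt way to report how conclusions change, rather than whether a p-value survives, is to show best-case/worst-case bounds alongside the primary estimate. The sketch below assumes a binary outcome and uses hypothetical counts; the function names are invented for the example. The bounds are deliberately extreme, so they frame the plausible scenarios rather than replace them.

```python
def risk_bounds(events_observed, n_observed, n_missing):
    """Lowest and highest possible arm-level risk given the missing outcomes."""
    n = n_observed + n_missing
    lower = events_observed / n                 # no missing patient had the event
    upper = (events_observed + n_missing) / n   # every missing patient had the event
    return lower, upper

def risk_difference_bounds(trt, ctl):
    """Bounds on the risk difference (treatment minus control).

    `trt` and `ctl` are (events_observed, n_observed, n_missing) tuples.
    """
    t_low, t_high = risk_bounds(*trt)
    c_low, c_high = risk_bounds(*ctl)
    best_for_treatment = t_low - c_high
    worst_for_treatment = t_high - c_low
    return best_for_treatment, worst_for_treatment

# Hypothetical counts: (events among observed, number observed, number missing).
treatment = (79, 440, 60)
control = (110, 460, 40)

best, worst = risk_difference_bounds(treatment, control)
print(f"Risk difference ranges from {best * 100:+.1f} to {worst * 100:+.1f} percentage points")
```

If the interval spans zero, as it does with these made-up counts, the data alone cannot settle the direction of the effect, and the argument has to rest on which scenarios inside the bounds are clinically credible.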
Reviewer Red-Flag Table
| If the paper says... | Likely concern | What to ask next |
|---|---|---|
| “Missing data were handled using multiple imputation.” | Handling is not justification; the missingness assumption may still be doing all the heavy lifting. | What makes MAR credible here, and what MNAR sensitivity analysis was performed? |
| “Baseline variables and treatment group were included in the imputation model.” | Observed covariates may not capture why sicker or less adherent patients disappeared. | What unmeasured or weakly measured drivers of missingness remain plausible? |
| “Results were similar in complete-case and imputed analyses.” | Both analyses may share the same optimistic missingness assumption or a similar blind spot. | How sensitive is the result to worse unseen outcomes among those lost to follow-up? |
| “We assumed missing outcomes were comparable to observed outcomes after adjustment.” | That sentence is the assumption that needs defending, not a conclusion. | What data, process knowledge, or sensitivity analysis supports that assumption? |
Where Aqrab Fits
Missing data sections are full of polite methodological shortcuts because reviewers are busy and software output looks reassuring. Aqrab is useful when you want the impolite follow-up questions asked consistently: what assumption made this imputation legal, what would break it, and how fragile is the conclusion if the unseen outcomes were worse than advertised?
If you want a manuscript stress-tested before peer review does it less gently, try Aqrab. If you want those critique patterns embedded in your own review workflow, the developer tools are the more scalable route.
The Practical Bottom Line
Multiple imputation is a method, not absolution.
When missingness may depend on the unseen outcome, a single MAR analysis should be treated as the start of the robustness story, not the end of it. The honest paper shows how much hidden bad news the result can tolerate before the conclusion loses its swagger.
If the answer is “not much,” that is not a failure of statistics. It is a useful warning label.