Pharmacoepidemiology · Study Design · Methods Critique

Washout Periods: When “New Use” Is Just Old Use with Better PR

May 28, 2026·16 min read

Anas H. Alzahrani, MD PhD MPH

Department of Preventive Medicine and Public Health

Faculty of Medicine, King Abdulaziz University

Plenty of observational drug studies announce a shiny new-user design and then define “new” with the methodological confidence of someone checking whether the fridge is empty by opening it for half a second. No dispensing in the last six months? Excellent. Probably incident use. Probably.

A washout period is the pre-index interval during which no prior exposure is allowed if a patient is to count as a new or incident user. The idea is sensible: you want cohort entry to align with treatment initiation rather than with some later, calmer chapter of the treatment story. The trouble is that washout windows are often chosen by habit, data availability, or vibes rather than by a defensible account of prescribing rhythm, intermittent use, and the clinical question.
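
The definition above can be written down as a small check. This is a minimal sketch, not a production cohort builder: the function name, the half-open window convention (index date excluded, window start included), and the patient data are all illustrative assumptions.

```python
from datetime import date, timedelta


def has_clean_washout(index_date, dispensing_dates, washout_days):
    """Return True if no dispensing falls in [index_date - washout_days, index_date)."""
    window_start = index_date - timedelta(days=washout_days)
    return not any(window_start <= d < index_date for d in dispensing_dates)


# Hypothetical patient: one earlier fill roughly seven months before the index fill.
index = date(2025, 9, 1)
fills = [date(2025, 1, 20), index]

print(has_clean_washout(index, fills, washout_days=180))  # earlier fill lies outside the window
print(has_clean_washout(index, fills, washout_days=365))  # earlier fill lies inside the window
```

The same patient is "new" under one window and not under the other, which is the whole problem in two lines of output.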

The Core Design Rule

A washout period is not a ceremonial moat around cohort entry. It is an identification rule. If it is too short, continuing users leak into the incident-user cohort and you rebuild prevalent-user bias with cleaner formatting. If it is too long, you may exclude realistic treatment restarters, narrow the cohort into a peculiar subset, and quietly change the estimand.

Decision rule:

Choose the shortest washout that can plausibly exclude continuing use for the treatment pattern you are studying, and only if the database can actually observe that entire window.

Or in less diplomatic language: a twelve-month washout inside an eight-month claims history is not a design choice. It is a trust fall.
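
The second half of the decision rule is simple arithmetic, sketched below. The function names are hypothetical, and the sketch assumes history is measured in whole months of continuous observation before the index date.

```python
def history_shortfall_months(washout_months, observed_history_months):
    """Months of the washout window the database cannot observe (0 if fully observed)."""
    return max(0, washout_months - observed_history_months)


def washout_is_observable(washout_months, observed_history_months):
    """True only when the entire washout window sits inside observed history."""
    return history_shortfall_months(washout_months, observed_history_months) == 0


# The trust-fall case from the text: twelve-month washout, eight months of history.
print(history_shortfall_months(12, 8))
print(washout_is_observable(12, 8))
print(washout_is_observable(6, 12))
```

A nonzero shortfall means part of the "no prior exposure" claim rests on months the data never saw.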

Why Washout Periods Matter More Than They First Appear

1. They define who is “new”

The washout does not just tidy the baseline. It decides whether the study observes first use, later continuation, or some ambiguous re-entry after a gap in captured treatment.

2. They shape confounding control

A long washout can consume most available lookback history, leaving less room to characterize baseline severity, healthcare utilization, and prior treatment trajectories.

3. They change the clinical question

The right washout for lifetime first initiation is not necessarily the right one for treatment restart, step-up therapy, or episodic medication use.

A Concrete Clinical Example

Imagine a comparative effectiveness study of GLP-1 receptor agonists versus DPP-4 inhibitors in adults with type 2 diabetes. The paper uses a six-month washout and calls everyone with no dispensing in that interval a new user.

Why six months may be too short

Some patients stop and restart therapy after side effects, cost barriers, or formulary changes. A six-month gap may identify a restart cohort while the manuscript keeps saying incident initiation.

Why twelve months may not solve everything

If the database only has thirteen months of history, an elegant twelve-month washout leaves almost no room to measure baseline disease trajectory, monitoring intensity, or prior comparator use.

What the protocol should say instead

State whether the target estimand is first observed initiation in available data or treatment restart, justify the washout from prescribing cadence, and show sensitivity analyses with nearby windows.
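
A sensitivity analysis over nearby windows can start by showing who qualifies under each one. The sketch below is an illustration under stated assumptions: the three patients, their dates, the field names, and the candidate windows (90, 180, 365 days) are invented, and a patient qualifies only if the whole window is observed and contains no earlier fill.

```python
from datetime import date, timedelta

# Hypothetical patients: index fill date, start of observable history, earlier fills.
PATIENTS = [
    {"id": "A", "index": date(2025, 9, 1), "observed_from": date(2024, 1, 1),
     "prior_fills": []},
    {"id": "B", "index": date(2025, 9, 1), "observed_from": date(2024, 1, 1),
     "prior_fills": [date(2025, 1, 20)]},
    {"id": "C", "index": date(2025, 9, 1), "observed_from": date(2025, 2, 1),
     "prior_fills": []},
]


def qualifies(patient, washout_days):
    """New user only if the full window is observed and holds no prior fill."""
    window_start = patient["index"] - timedelta(days=washout_days)
    fully_observed = patient["observed_from"] <= window_start
    clean = not any(window_start <= d < patient["index"]
                    for d in patient["prior_fills"])
    return fully_observed and clean


def cohort_by_washout(patients, windows_days):
    """Map each candidate washout length to the patient ids that qualify under it."""
    return {w: [p["id"] for p in patients if qualifies(p, w)] for w in windows_days}


for window, ids in cohort_by_washout(PATIENTS, [90, 180, 365]).items():
    print(window, ids)
```

In this toy data the cohort is unchanged between 90 and 180 days and shrinks at 365 days for two different reasons: one patient has an earlier fill, the other lacks enough observable history. A real analysis would then refit the outcome model on each cohort and report all of the estimates.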

Interactive washout triage

Is this washout defining new use, or just flattering the cohort?

Adjust the treatment rhythm, database history, and analytic goal. The tool estimates whether the washout is too short to exclude continuing users, too long for the clinical question, or simply unsupported by the available lookback.

Likely interpretation: Plausible washout choice

Leakage risk: 5%. Approximate risk that apparent new users still include continuing users whose prior treatment is hidden by an insufficient gap.

Restriction cost: 18%. A rough signal for how much the washout may shrink or over-select the cohort away from the decision context.

Minimum plausible washout: 3 months. Based on the treatment cadence and use pattern entered above, shorter windows are likely to mislabel ongoing users as incident users.

Plausible washout choice

The chosen lookback window broadly matches the refill rhythm and use pattern, so the new-user label is at least clinically defensible.

Estimated estimand drift: Closer to first observed treatment initiation among patients with enough prior clean history

You still need to justify why this drug, this database, and this clinical setting support the chosen washout rather than a nearby alternative.

| Design quantity | Value | Why it matters |
| --- | --- | --- |
| Chosen washout | 6 months | This is the rule that determines whether the study sees a patient as a new user or a continuing user. |
| Available pre-index history | 12 months | Without enough observed history, the washout becomes partly aspirational instead of empirically verified. |
| History shortfall | None | If this is nonzero, the database is missing part of the very window used to certify incident treatment. |
| Practical interpretation | Plausible washout choice | This is the reviewer-facing bottom line the protocol should be able to defend before modeling begins. |

How to Pick a Washout Without Pretending the Database Has Better Memory Than It Does

| Design situation | A sensible instinct | What can go wrong | Reviewer question |
| --- | --- | --- | --- |
| Chronic maintenance therapy | Washout should exceed the longest plausible refill cycle and grace around irregular fills. | Short windows relabel continuing users as incident users. | How often can stable patients go between fills without truly being off therapy? |
| Intermittent or episodic treatment | Define whether the question is first use, restart, or episode initiation. | A long washout may over-purify the cohort into an unusual subset of long-gap users. | Does the chosen window match the actual treatment rhythm or just a convention from another drug class? |
| Limited baseline history | Keep the washout fully observable and preserve enough history for confounders. | The study can become both under-verified for incident use and under-measured for baseline severity. | How much pre-index history remains after the washout to measure the things that drive treatment choice? |
| Sensitivity analysis | Show nearby plausible windows rather than one enchanted number. | A single favored window can look suspiciously selected for result behavior. | Does the conclusion survive shorter and longer clinically plausible washouts? |

Five Failure Modes That Deserve Less Politeness

1. The washout is shorter than ordinary refill behavior

If a patient can plausibly go four or five months between observed dispensings, a three-month washout does not establish new use. It establishes impatience.
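
One rough way to check this before committing to a window is to look at refill gaps among patients who are known to be on therapy. The sketch below is an assumption-laden illustration: the gap values are invented, the function name is hypothetical, and it treats a gap as the number of days between consecutive dispensing dates without accounting for days supply.

```python
def share_of_gaps_longer_than(gap_days, washout_days):
    """Fraction of inter-fill gaps among continuing users that outlast the washout.

    A continuing user whose latest gap exceeds the washout would show no fill
    inside the window and would be mislabeled as a new user.
    """
    if not gap_days:
        raise ValueError("need at least one inter-fill gap")
    return sum(g > washout_days for g in gap_days) / len(gap_days)


# Hypothetical gaps, in days, between consecutive dispensings in ongoing users.
gaps = [30, 32, 35, 60, 95, 120, 28, 150, 31, 45]

for washout in (90, 180):
    print(washout, share_of_gaps_longer_than(gaps, washout))
```

If a meaningful share of ordinary refill gaps outlasts the proposed washout, the window is measuring pharmacy logistics rather than treatment initiation.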

2. The paper never distinguishes incident use from restart

Restarters often differ from first-time initiators in prior tolerance, disease trajectory, and clinician expectations. Lumping them together muddies both design and interpretation.
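
Keeping the two groups apart is mostly a labeling discipline. The sketch below shows one possible labeling, assuming continuous observation from a known start date; the function name, the four labels, and the example dates are illustrative choices, not a standard taxonomy.

```python
from datetime import date, timedelta


def classify_index_fill(index_date, observed_from, prior_fills, washout_days):
    """Label an index fill using earlier fills and the observable history."""
    window_start = index_date - timedelta(days=washout_days)
    earlier = [d for d in prior_fills if d < index_date]
    if any(d >= window_start for d in earlier):
        return "continuing use"             # a fill sits inside the washout window
    if earlier:
        return "restart"                    # prior use exists, but before the window
    if observed_from > window_start:
        return "unverified"                 # no fills seen, but the window is not fully observed
    return "first observed initiation"      # clean and fully observed window


index = date(2025, 9, 1)
history_start = date(2024, 1, 1)

print(classify_index_fill(index, history_start, [date(2025, 7, 1)], 180))
print(classify_index_fill(index, history_start, [date(2025, 1, 20)], 180))
print(classify_index_fill(index, date(2025, 6, 1), [], 180))
print(classify_index_fill(index, history_start, [], 180))
```

Reporting the counts in each label, rather than a single "new user" total, makes it visible when a cohort described as incident is mostly restarters.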

3. The database cannot observe the full washout

This is common and avoidable. If enrollment, EHR continuity, or claims capture begins after the washout has already started, part of the incident-user definition lives offstage.

4. A long washout quietly hollows out baseline measurement

The more history you reserve for proving no prior treatment, the less history remains to characterize disease severity, utilization patterns, prior therapies, and outcome risk.

5. The chosen window is defended only because the estimate looked nicer

Washout sensitivity analysis is supposed to test robustness, not to provide a scavenger hunt for the most flattering hazard ratio in the neighborhood.

Reviewer Red Flags for “Incident User” Claims

  • The washout window is named, but not justified. “We used 180 days” is not a rationale.
  • The treatment pattern is never described. Chronic, intermittent, and episodic therapies do not deserve the same default.
  • Available history is shorter than the washout. The cohort is being certified with missing paperwork.
  • The manuscript says incident use but the protocol behaves like restart. Those are different patients and often different causal questions.
  • No neighboring washout windows are shown. One lucky threshold is not a robustness strategy.
  • Baseline covariates depend on history that the washout already consumed. The design may be incident-clean but confounding-blind.

What Aqrab Should Help Teams Catch

Washout choices are exactly the sort of design detail that gets waved through in protocol review and then determines whether the cohort means what the title claims. This is not glamorous, but it is where a lot of observational credibility leaks out.

Practical takeaway

Before you trust an incident-user cohort, ask three things: how the treatment is actually used, how much history the data truly observe, and whether the chosen washout matches the estimand rather than the analyst’s muscle memory.

If your team wants a faster way to stress-test cohort definitions, reviewer red flags, and estimand drift before the manuscript gets expensive, Aqrab is built for exactly that kind of methods critique. Try it at /try or inspect the workflow ideas on /developers.

The Short Version

Washout periods are not tiny housekeeping variables. They decide who counts as newly treated, what treatment history is being compared, and whether a so-called new-user design still smells suspiciously like a prevalent-user cohort in a fresh coat of paint.

A defensible washout is clinically argued, fully observable in the data, and explicitly tied to the estimand. Anything less is not rigor. It is formatting.
