
The Reference Class Problem: Why You're Comparing Your Situation to the Wrong Precedents

M. Linden

Every probability estimate you make is secretly a comparison. When you say a project has a 70% chance of finishing on time, you're drawing, consciously or not, on a pool of similar projects and guessing where yours lands. That pool is your reference class. And if you've chosen the wrong one, your 70% is fiction dressed up as analysis.

This is the reference class problem, and it quietly undermines more decisions than most people realize.

What the Problem Actually Is

Philosopher John Venn first noticed this in the 19th century, though the practical implications took another hundred years to fully surface in decision research. The issue is deceptively simple: any single event can be categorized in multiple ways, and each categorization implies a different historical pool, a different base rate, a different probability.

Consider a startup founder estimating their odds of reaching Series B funding. Are they drawing from the class of all startups? (Grim odds.) All funded startups in their sector? (Better.) All startups founded by repeat founders in their geography during a bull market? (Different again.) Each reference class is technically valid. Each produces a different number. The estimate you reach depends almost entirely on which bucket you choose, and most people choose based on what feels intuitively similar, which is not the same as what is statistically relevant.
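To make that dependence concrete, here is a minimal Python sketch. The records are invented for illustration (they are not real funding statistics), and the field names `sector`, `repeat_founder`, and `reached_series_b` are hypothetical. The only point is that the same founder gets a different base rate depending on which filter defines the class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Startup:
    sector: str
    repeat_founder: bool
    reached_series_b: bool


# Invented toy records, NOT real funding data.
STARTUPS = [
    Startup("fintech", True, True),
    Startup("fintech", True, True),
    Startup("fintech", True, False),
    Startup("fintech", False, True),
    Startup("fintech", False, False),
    Startup("fintech", False, False),
    Startup("health", True, True),
    Startup("health", False, False),
    Startup("health", False, False),
    Startup("retail", True, False),
    Startup("retail", False, False),
    Startup("retail", False, False),
]


def base_rate(records, in_class):
    """Share of records inside the reference class that reached Series B."""
    members = [r for r in records if in_class(r)]
    if not members:
        raise ValueError("reference class is empty")
    return sum(r.reached_series_b for r in members) / len(members)


# Three candidate reference classes for the same fintech repeat founder.
rates = {
    "all startups": base_rate(STARTUPS, lambda r: True),
    "fintech only": base_rate(STARTUPS, lambda r: r.sector == "fintech"),
    "fintech, repeat founder": base_rate(
        STARTUPS, lambda r: r.sector == "fintech" and r.repeat_founder
    ),
}

for label, rate in rates.items():
    print(f"{label}: {rate:.0%}")
```

With these made-up records the three classes give roughly a third, a half, and two thirds. Nothing about the founder changed between the three lines; only the definition of "similar" did.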

Daniel Kahneman and Amos Tversky called this tendency to lean on the specific, vivid details of your own case, rather than the cold base rates from a larger class, the inside view. Switching deliberately to a broader historical pool is the outside view. The research is fairly consistent: the outside view produces better-calibrated forecasts. People almost never use it without prompting.

Why We Default to the Wrong Class

Partly it's availability. The cases that come to mind most easily feel most relevant, even when they're not. A hospital administrator estimating the timeline for a new IT rollout will naturally anchor on IT rollouts they've personally witnessed, a sample of maybe three or four projects, heavily filtered by memory and proximity. That's not a reference class; it's anecdote with a veneer of data.
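One way to see why three or four remembered projects make a weak reference class is to put an interval around the observed rate. The sketch below uses the Wilson score interval, a standard textbook choice picked here for illustration rather than anything this article prescribes, and the counts are invented.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts: the same 75% on-time rate from very different samples.
remembered = wilson_interval(3, 4)      # four projects someone recalls
industry = wilson_interval(300, 400)    # a larger documented pool

for label, (low, high) in [("n=4", remembered), ("n=400", industry)]:
    print(f"{label}: {low:.0%} to {high:.0%}")
```

Both samples show the same 75% headline rate, but the four-project interval spans roughly 30% to 95%, while the larger pool narrows to roughly 71% to 79%. The interval also says nothing about the memory and proximity filtering described above, which makes the small sample weaker still.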

There's also motivated reasoning. Choosing a flattering reference class is easy to rationalize. Your project isn't like those that failed: yours has better leadership, a clearer market, stronger backing. Maybe. But everyone who ever picked a losing reference class said something similar.

And sometimes the right reference class genuinely doesn't exist, or the data is locked behind proprietary walls. This is where the problem sharpens into something more serious than bias: it becomes a question of how to act when your comparison set is either wrong or unavailable.

A More Deliberate Approach

The fix isn't to find the perfect reference class; that's usually impossible. It's to triangulate across several classes, notice where they diverge, and treat that divergence as a meaningful signal about your uncertainty.

Here's a rough process worth building into any high-stakes estimate:

```mermaid
graph TD
    A[Define the decision or estimate] --> B{What are the relevant categories?}
    B --> C[Broad class: all similar events]
    B --> D[Narrow class: close analogs only]
    C --> E(Record base rate from broad class)
    D --> F(Record base rate from narrow class)
    E --> G{Do the rates diverge significantly?}
    F --> G
    G --> H[If yes: treat gap as uncertainty range]
    G --> I[If no: use as converging evidence]
```

When the broad and narrow classes agree, that's a useful signal. When they diverge sharply, don't average them and move on. The gap itself is telling you something about how much your specific situation differs from prior cases.
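The comparison step can be sketched in a few lines of Python. The function name `triangulate` and the 10-point `tolerance` are assumptions made for this example; what counts as "significant" divergence is a judgment call the process above leaves open.

```python
def triangulate(broad_rate, narrow_rate, tolerance=0.10):
    """Compare base rates from a broad and a narrow reference class.

    `tolerance` is a hypothetical cutoff for "diverge significantly".
    """
    for rate in (broad_rate, narrow_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must be between 0 and 1")
    low, high = sorted((broad_rate, narrow_rate))
    gap = high - low
    # Deliberately no averaging: a wide gap is reported as a range.
    verdict = "divergent" if gap > tolerance else "converging"
    return {"verdict": verdict, "range": (low, high), "gap": gap}


# Invented rates for illustration.
print(triangulate(0.30, 0.65))  # wide gap: carry the whole range forward
print(triangulate(0.42, 0.45))  # close agreement: converging evidence
```

Returning the range instead of a midpoint is the design choice that matters here. A single blended number would hide exactly the disagreement that signals how unusual your case might be.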

Philip Tetlock's work on forecasting accuracy adds another layer here. His best forecasters, the so-called superforecasters, explicitly named their reference class before committing to a probability estimate, then adjusted for features of the specific case that genuinely distinguished it. They didn't abandon base rates; they used them as a starting point rather than an afterthought.
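One common way to formalize "start from the base rate, then adjust" is Bayes' rule in odds form. This is an illustrative choice, not a description of how Tetlock's forecasters actually worked, and the likelihood ratios below are hypothetical numbers standing in for case-specific evidence.

```python
def adjust_base_rate(base_rate, likelihood_ratios):
    """Update a base rate with case-specific evidence, in odds form.

    Each likelihood ratio says how much more likely a piece of evidence is
    if the outcome happens than if it does not (above 1 raises the estimate,
    below 1 lowers it). Multiplying ratios assumes the pieces of evidence
    are roughly independent, which is an assumption of this sketch.
    """
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must be strictly between 0 and 1")
    odds = base_rate / (1.0 - base_rate)
    for ratio in likelihood_ratios:
        if ratio <= 0:
            raise ValueError("likelihood ratios must be positive")
        odds *= ratio
    return odds / (1.0 + odds)


# Hypothetical: a 30% base rate, one favorable feature (2.0)
# and one mildly unfavorable feature (0.8).
estimate = adjust_base_rate(0.30, [2.0, 0.8])
print(f"adjusted estimate: {estimate:.0%}")
```

In this made-up case the estimate moves from 30% to about 41%. The base rate stays visible as the anchor, and every adjustment has to be stated as an explicit number instead of being absorbed into a gut feeling.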

The Asymmetry Worth Remembering

Choosing too narrow a reference class tends toward overconfidence. You're essentially saying, "My situation is unique enough that general patterns don't apply," which might occasionally be true but is far more often an ego protecting itself from bad odds.

Choosing too broad a class can make everything look equally uncertain, which is its own kind of analytical paralysis.

Neither failure mode is benign. But in most high-stakes contexts (launching products, allocating resources, assessing organizational risk), overconfidence kills more decisions than excessive caution. That asymmetry suggests erring toward broader classes when in doubt, then tightening only when you have genuine, specific evidence that your situation departs from the norm.

Your situation is probably not as unique as it feels. That's uncomfortable. It's also usually the more accurate starting point.
