The Tychean Codex: Mediocrestan & Extremistan

“To bankrupt a fool, give him information.”

— Nassim Taleb, The Bed of Procrustes: Philosophical and Practical Aphorisms

Note: This is part one of The Tychean Codex, a six-part series drawing from Taleb's Incerto. It aims to synthesize core concepts from the series and build the intellectual foundation for a tail-risk investment strategy.

The code for the Monte Carlo simulations and generating the plots can be found here.

A Cleaver
Measuring Uncertainty
Mediocrestan
Extremistan
Becoming a Turkey
The Tail End
Learn More

A Cleaver

Imagine you are a turkey.

Every morning you wake up in the same hut, and your owner brings food at the same time each day. Each meal arrives as regularly as the sunrise. Day after day, the same pattern of events unfolds and your confidence in them compounds. Eventually, the arrival of food isn't a mere prediction but a certainty, a fact of life. Then one morning your kind, faithful, reliable owner shows up with a cleaver and severs your neck for a holiday feast.

I am the turkey. You are the turkey. As a society, we are the turkey.

Long-Term Capital Management wasn't run by amateurs. Two Nobel Prize-winning economists, former Salomon Brothers partners, and a former Federal Reserve Vice Chairman built models sophisticated enough to generate 40%+ annual returns (four times the S&P 500's historical average). Then Russia defaulted on its domestic debt in August 1998. The spreads LTCM had bet on collapsed simultaneously across every market they were invested in, an event their models had deemed virtually impossible. The fund was worth $4.7 billion at the start of the year. By autumn it was insolvent.

The engineers at the Fukushima Daiichi Plant weren't reckless either. Tsunamis are nothing new to Japan, and the plant's seawalls were designed to handle the largest waves in the historical coastal record. On March 11, 2011, a magnitude 9.0 earthquake sent waves nearly three times that maximum height crashing over them. The backup generators flooded. Cooling failed. Three reactor cores melted down, spilling radioactive waste into the seawater.

In September 2000, Reed Hastings and Marc Randolph flew to Dallas and offered to sell Netflix, a DVD-by-mail startup, to Blockbuster for $50 million. The Blockbuster CEO passed. It's genuinely hard to fault him. Blockbuster had 9,000 stores, had survived every prior disruption to home video, and had the track record to prove it. Ten years later Blockbuster was bankrupt. Netflix was worth $13 billion.

The error in each case wasn't bad data. It was a hidden assumption that the future would be drawn from the same distribution as the past. Just like the turkey, each chose to use their past experience to model the future.

That's what this post is about: the implications of this assumption, when it's true and when it isn't. It draws from Nassim Taleb's Incerto, a work that describes two fundamentally different worlds and why mistaking one for the other is, in a very literal sense, a matter of life and death.

Measuring Uncertainty

Being able to predict the future is a superpower. It gives you the ability to survive, to make better decisions, to avoid death, to get rich. So how do we do it?

Let's start with a crash course in probability. Not in a heady, academic sense. In a practical one. We all learned in grade school that probability is a number between 0 and 1 that tells us how likely an event is to occur. If you flip a fair coin, there is a 50% chance (a probability of 0.5) that it will land heads.

Practically, this number doesn't tell us the future (you don't really know what the future holds until it occurs; probability is purely theoretical), it tells us about our current ignorance. When you say there's a 50% chance of a coin landing heads, you're not describing a property of the coin, you're describing the limits of your knowledge about what it will do.

A model representing what probability means to us, practically speaking.

A probability distribution is a way to represent all possible outcomes of an uncertain event, and how likely each one is. If probability answers "how likely is this one thing to happen", a probability distribution answers "here is everything that could happen, and the likelihood of each outcome". The probabilities across the whole distribution must sum to 1. Something has to happen. Let's walk through some examples.

For our fair coin toss, there is a fifty-fifty chance of getting heads, or tails. There are two outcomes, each equally likely. The idealized probability distribution looks like this:

The probability distribution for flipping a fair coin.

We can run a Monte Carlo simulation to replicate what would happen if we flipped a coin ten thousand times. This is the distribution from that simulation. Reality closely matches our ideal distribution, but varies slightly:

When you run an experiment in reality, it will almost never reproduce the ideal results exactly. The ideal probability is your expectation of the future; the distribution represents your ignorance, not the actual future.
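For readers who want to try this themselves, here is a minimal sketch of a coin-flip Monte Carlo simulation. It is not the code behind the plots in this post; the function name `simulate_coin_flips` and the seed are illustrative choices, and it uses only Python's standard library.

```python
import random
from collections import Counter


def simulate_coin_flips(n_flips, seed=None):
    """Flip a fair coin n_flips times and return the observed share of each face."""
    rng = random.Random(seed)  # a local generator keeps each run reproducible
    counts = Counter(rng.choice(["heads", "tails"]) for _ in range(n_flips))
    return {face: counts[face] / n_flips for face in ("heads", "tails")}


if __name__ == "__main__":
    observed = simulate_coin_flips(10_000, seed=42)
    for face, share in observed.items():
        print(f"{face}: ideal 0.5000, simulated {share:.4f}")
```

Change the seed and the simulated shares will wobble around 0.5 without landing on it exactly, which is the gap between the ideal distribution and any single realized experiment.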

Let's look at another case, rolling a fair, six-sided die. Instead of two outcomes (heads/tails) there are six outcomes. The two distributions are shown below:

Our idealized fair die has a roughly 17% chance of landing on any given number.

Again, our actual dice roll differs slightly from our idealized distribution. Not by much.
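How much is "not by much"? One rough way to check is to measure the largest gap between any face's observed frequency and the ideal 1/6, and watch how that gap behaves as the number of rolls grows. The sketch below is an illustration under my own assumptions (the function name, the roll counts, and the seed are all arbitrary), not the simulation used for the plot.

```python
import random


def max_deviation_from_ideal(n_rolls, seed=0):
    """Roll a fair six-sided die n_rolls times and return the largest gap
    between any face's observed frequency and the ideal 1/6."""
    rng = random.Random(seed)
    counts = [0] * 6
    for _ in range(n_rolls):
        counts[rng.randint(1, 6) - 1] += 1  # randint is inclusive on both ends
    return max(abs(count / n_rolls - 1 / 6) for count in counts)


if __name__ == "__main__":
    for n in (60, 600, 6_000, 60_000):
        print(f"{n:>6} rolls: largest gap {max_deviation_from_ideal(n):.4f}")
```

The gap tends to shrink as the number of rolls increases, though not necessarily at every step. More data narrows the distance between the simulation and the ideal, but it never turns the simulation into the ideal.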

Our last distribution is called a “normal” distribution. You may also know it as the “bell curve” or the “Gaussian distribution”. It shows up throughout nature because of a remarkable mathematical fact, known as the central limit theorem: when an outcome is the result of many small independent influences added together, the result tends to follow this distribution regardless of what those individual influences look like. You can see the pattern in human height, manufacturing tolerances, IQ scores, annual rainfall, the velocities of gas molecules, and blood pressure readings, to name just a few examples.

A plot of the idealized normal distribution.
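The claim that many small independent influences add up to a bell curve can be checked with a short simulation. In the sketch below, each "influence" is a uniform random nudge between -1 and +1, which on its own looks nothing like a bell curve. The number of influences, the sample size, and the function names are assumptions I picked for illustration.

```python
import random
import statistics


def simulate_sums(n_outcomes=5_000, n_influences=50, seed=7):
    """Each outcome is the sum of many small, independent uniform nudges."""
    rng = random.Random(seed)
    return [
        sum(rng.uniform(-1, 1) for _ in range(n_influences))
        for _ in range(n_outcomes)
    ]


def share_within_one_sd(values):
    """Fraction of values lying within one standard deviation of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(abs(value - mean) <= sd for value in values) / len(values)


if __name__ == "__main__":
    sums = simulate_sums()
    print(f"share within one standard deviation: {share_within_one_sd(sums):.3f}")
```

A normal distribution puts about 68% of its outcomes within one standard deviation of the mean, while a single uniform nudge puts only about 58% there. The summed outcomes should land close to the 68% figure, even though no individual input is bell-shaped.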

This distribution is attractive to anyone modeling data. With only two values, you're able to completely describe the entire distribution, every possible outcome and its likelihood. The mean (or average) tells you where outcomes cluster, and the standard deviation tells you how widely they spread. This is useful for risk modeling, scientific research, engineering, medicine, and virtually any field where the goal is to characterize what is typical, and how far from typical you should expect outcomes to stray.

The equation for a normal distribution. It has only two parameters: mean and standard deviation.
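To make the "only two parameters" point concrete: the density of a normal distribution is f(x) = exp(-(x - μ)² / (2σ²)) / (σ√(2π)), where μ is the mean and σ is the standard deviation. Python's standard library ships this as `statistics.NormalDist`. The sketch below uses hypothetical height parameters (a mean of 170 cm and a standard deviation of 10 cm) purely for illustration; they are not measured values.

```python
from statistics import NormalDist


def share_within(dist, k):
    """Probability that a draw lands within k standard deviations of the mean."""
    upper = dist.cdf(dist.mean + k * dist.stdev)
    lower = dist.cdf(dist.mean - k * dist.stdev)
    return upper - lower


# Hypothetical parameters for adult height in centimeters (illustrative only).
heights = NormalDist(mu=170, sigma=10)

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"within {k} standard deviation(s): {share_within(heights, k):.4f}")
```

Whatever mean and standard deviation you plug in, the shares come out the same: about 68% within one standard deviation, 95% within two, and 99.7% within three. Once you know the two numbers, you know the whole shape.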

The core things to keep in mind are that probabilities and probability distributions are beliefs. They are models. They represent an unrealized future state. They aren't real. Nobody knows with certainty what the future holds. When we ran our Monte Carlo simulations above, they always deviated slightly from our “idealized” outcome. We can use data to infer what that outcome may be, but it's just that: an inference.

Moreover, the accuracy of that inference depends entirely on the kind of world you are modeling. When effects are independent and additive, statistical tools make incredibly accurate predictions about the future. But not every domain plays by those rules. There are, in fact, two fundamentally different worlds, and the one you're in determines everything.

Mediocrestan

Mediocrestan is the world of predictably random, small variations, the kind of variations that can be easily modeled with a normal distribution or other statistical tools. Taleb illustrates the concept in The Black Swan with the following example:

“Assume you round up a thousand people randomly selected from the general population and have them stand next to each other in a stadium... Imagine the heaviest person you can think of and add him to that sample. Assuming he weighs three times the average, between four hundred and five hundred pounds, he will rarely represent more than a very small fraction of the weight of the entire population (in this case, about a half a percent.)... You can get even more aggressive. If you picked the heaviest biologically possible human on the planet (who yet can still be called a human), he would not represent more than, say, 0.6 percent of the total, a very negligible increase.”

— Nassim Taleb, The Black Swan

The core property of Mediocrestan is that no single observation can dominate the whole.

Domains governed by Mediocrestan:

Human height and weight
Batting averages
Hours of sunlight per day
Flight durations between cities
The average square footage of a home

Take human height as a concrete example. If you took Robert Wadlow, the tallest man ever recorded at nearly 9 feet, and added his height to the current population of human heights, it would move the average by essentially nothing. His height has little to no effect on the distribution. This is because systems that create Mediocrestan-esque behavior contain effects that are additive and independent.
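The arithmetic behind "essentially nothing" is short enough to write down. When one new observation joins n others, the average moves by (new value - old average) / (n + 1). The sketch below assumes an average height of 170 cm, a height of roughly 272 cm for Wadlow, and a world population of about 8 billion; all three are round numbers chosen for illustration.

```python
def mean_shift(n_people, current_mean, new_value):
    """Change in the average when one new observation joins n_people others.

    Derived from ((n * mean) + x) / (n + 1) - mean = (x - mean) / (n + 1).
    """
    return (new_value - current_mean) / (n_people + 1)


if __name__ == "__main__":
    average_cm = 170.0  # assumed average height
    wadlow_cm = 272.0   # assumed height of the tallest recorded man
    for n in (1_000, 1_000_000, 8_000_000_000):
        shift = mean_shift(n, average_cm, wadlow_cm)
        print(f"{n:>13,} people: average moves by {shift:.10f} cm")
```

Even in a stadium of a thousand people, the most extreme height on record moves the average by about a millimeter. Across the whole population, the shift is on the order of a hundred-millionth of a centimeter.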

Your final height is the sum of thousands of genetic and environmental inputs, things like nutrition, hundreds of gene variants, sleep, and so on. Each contributes a small nudge up or down. Each adds or subtracts a small amount from the final outcome (your height). Similarly, each individual's height nudges the average height of the population up or down slightly.

Your Height = Nutrition + Gene 1 + Gene 2 + … + Sleep
Average Human Height = (Person 1 Height + Person 2 Height + … + Person n Height) / n
Final Outcome = Factor 1 + Factor 2 + Factor 3 + … + Factor n

In addition to being additive, each factor is independent. In Mediocrestan, one observation doesn't change the probability or magnitude of the next. Wadlow being almost 9 feet tall tells you nothing about how tall the next person you measure will be. His extraordinary height doesn't have an effect on anyone else. It doesn't pull other heights upward or reshape the distribution around it. It's considered statistically independent.

Mediocrestan is boring, and that's precisely what makes it safe to model. When effects in a system are additive and independent, outliers don't create large changes in the distribution. No single data point can hijack the whole. Statistical models using normal distributions accurately reflect reality, and the future looks reassuringly like the past. Unfortunately, not every domain plays by these rules. There is a second world, one where a single observation can dominate the entire distribution. Taleb calls this world Extremistan.

Extremistan

Unlike Mediocrestan, Extremistan is the world of wild, unpredictable variation, the kind that cannot be captured by a normal distribution and is incredibly difficult to model with our statistical tools. Taleb illustrates the concept in The Black Swan with the following example:

“Consider by comparison the net worth of the thousand people you lined up in the stadium. Add to them the wealthiest person to be found on the planet, say Bill Gates, the founder of Microsoft. Assume his net worth to be close to $80 billion, with the total capital of the others around a few million. How much of the total wealth would he represent? 99.9 percent?... For someone’s weight to represent such a share, he would need to weigh fifty million pounds!”

— Nassim Taleb, The Black Swan

The core property of Extremistan is the exact opposite of Mediocrestan. In Extremistan, a single observation can dominate the whole.
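You can see the two core properties side by side by measuring how much of the total the single largest observation contributes. The sketch below is a toy comparison under loud assumptions: weights are drawn from a bell curve (mean 170 lb, standard deviation 30 lb), and wealth is drawn from a Pareto distribution with a tail index of 1.1, a common stand-in for heavy-tailed quantities. Neither set of parameters is fitted to real data.

```python
import random


def largest_share(values):
    """Fraction of the total contributed by the single largest observation."""
    return max(values) / sum(values)


def simulate_shares(n=1_000, seed=11):
    """Return (largest weight share, largest wealth share) for n simulated people."""
    rng = random.Random(seed)
    # Assumption: weights in pounds, roughly bell-shaped.
    weights = [rng.gauss(170, 30) for _ in range(n)]
    # Assumption: wealth in dollars, Pareto-distributed with tail index 1.1.
    wealth = [50_000 * rng.paretovariate(1.1) for _ in range(n)]
    return largest_share(weights), largest_share(wealth)


if __name__ == "__main__":
    weight_share, wealth_share = simulate_shares()
    print(f"heaviest person's share of total weight: {weight_share:.2%}")
    print(f"richest person's share of total wealth:  {wealth_share:.2%}")
```

The heaviest person stays at a fraction of a percent of the total no matter how often you rerun it. The richest person's share jumps around from run to run and can be a large slice of the whole, which is the instability that defines Extremistan.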

Domains governed by Extremistan:

Supply chain disruptions
War casualties
Word frequency in languages
Hedge fund performance
Pandemic death tolls

Take word frequency as a concrete example. The English language contains roughly 170,000 words, but the ten most common ("the", "of", "and", "to", "a", and a handful of others) account for around 25% of all words ever written. Remove a single one of them and you remove anywhere from 1% to 10% of the entire written language. This would be the equivalent of a single human weighing as much as 80,000 blue whales, or about three times the combined weight of every ship in the US Navy. Contrary to Mediocrestan, this behavior is caused by systems that contain multiplicative and interdependent effects.

A word's frequency is not the product of thousands of small independent inputs. It is the product of network effects and reinforcing feedback loops (for those interested, refer to Zipf’s Law). Every time a word is used, it becomes more familiar, which makes it more likely to be used again. Usage increases usage. One infected person infects many more, causing the infection to spread. Being rich gives you more money and more lucrative investment opportunities, making you even richer. This is the opposite of the additive effects of Mediocrestan.

Word Frequency = Prior Usage × Familiarity × Cultural Reinforcement × … × Prior Usage
Investment = Returns × Previous Returns × Previous Returns × … × Previous Returns
Final Outcome = Factor 1 × Factor 2 × Factor 3 × … × Factor n
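The difference between adding factors and multiplying them is easy to underestimate, so here is a rough simulation of both. Every number in it is an assumption chosen for illustration: additive outcomes start at 100 and receive nudges between -1 and +1, while multiplicative outcomes start at 1 and are scaled by factors between 0.8 and 1.25. The factors are also drawn independently, so this sketch isolates the multiplicative part and leaves out the feedback loops described above.

```python
import random
import statistics


def additive_outcome(rng, n_factors=100, base=100.0):
    """A baseline plus many small independent nudges (Mediocrestan-style)."""
    return base + sum(rng.uniform(-1, 1) for _ in range(n_factors))


def multiplicative_outcome(rng, n_factors=100):
    """A product of many random growth factors (Extremistan-style)."""
    outcome = 1.0
    for _ in range(n_factors):
        outcome *= rng.uniform(0.8, 1.25)
    return outcome


def max_to_median(values):
    """How many times larger the biggest outcome is than the typical one."""
    return max(values) / statistics.median(values)


if __name__ == "__main__":
    rng = random.Random(3)
    added = [additive_outcome(rng) for _ in range(2_000)]
    multiplied = [multiplicative_outcome(rng) for _ in range(2_000)]
    print(f"additive:       max is {max_to_median(added):.2f}x the median")
    print(f"multiplicative: max is {max_to_median(multiplied):.2f}x the median")
```

With addition, the most extreme outcome sits a modest distance from the typical one. With multiplication, the same kind of small random factors compound, and the largest outcome ends up many times the size of the median.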

In Extremistan, one observation does change the probability of the next. Observations are not independent, but interdependent. The word "the" being common makes it more likely to appear again, and reinforces the use of other types of words, which makes it more common still. Its dominance is self-reinforcing and correlates strongly with the usage of other words. Its presence warps and reinforces the distribution, concentrating frequency among a small group of words and leaving thousands of others statistically irrelevant. It's the opposite of statistically independent.

Sebastian Steudtner set the Guinness World Record for surfing the world's largest wave at 26 meters (over 80 feet). For comparison, the waves that destroyed the Fukushima plant were about half the height, though the same earthquake generated waves 38 meters (over 100 feet) tall. Photo courtesy of Jorge Leal.

Extremistan is, in a word, wild, and very dangerous to model. When effects in a system are multiplicative and interdependent, a single outlier can dominate everything. One single event or data point can hijack the whole distribution. Statistical models using normal distributions fail to reflect this reality, and the future can look nothing like the past.

As the world has grown more connected, information and risk travel faster and farther than before. Events that once stayed local now cascade globally. Over the last few centuries, our species has drifted further from Mediocrestan and closer to Extremistan. Pandemics that once burned through a single city now sweep across continents. The wealthiest people in the world are no longer lords governing a large tract of land, but industrialists with generational wealth and global investments spanning hundreds of asset classes.

This is the world the turkey, and we, live in.

Becoming a Turkey

Let's return to one example from the introduction: Long-Term Capital Management. The fund's models were built on the assumption that financial markets behave like human height. They assumed that returns are normally distributed, that changes in price are independent events, and that the past is a reliable guide to the future. They had two Nobel laureates in economics on staff and decades of market data to train their models. They had every reason to believe they were correct, and for a while, they were.

Their strategy was built around a concept called mean reversion, the idea that when two related assets drift apart in price, they will eventually converge again. Bet on the convergence, collect the spread, repeat. The logic is similar to betting on height. If you know the average height of a human male is roughly 5'9", and you've just measured Robert Wadlow at 9 feet tall, would you bet that the next person you measure is taller than him, or shorter? In Mediocrestan, extreme observations revert to the mean. The next person is almost certainly closer to average. LTCM was making the same bet, just with bond prices instead of heights.

And for years, the bet paid off. If price divergences are normally distributed, a well-capitalized fund can weather the occasional bad trade. Their models told them the probability of a catastrophic divergence in all prices across all bonds was effectively zero. Because that probability was so small, they borrowed aggressively against it, eventually leveraging every dollar of capital into roughly thirty dollars of exposure.
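The arithmetic of thirty-to-one leverage is worth seeing on its own. The sketch below is a simplified model under stated assumptions: it ignores borrowing costs, margin calls, and everything else about how a real fund operates, and the function names are mine. It only tracks what happens to one dollar of capital when thirty dollars of exposure moves by a given percentage.

```python
def equity_after_move(capital, leverage, asset_return):
    """Capital remaining after the leveraged position moves by asset_return.

    Simplification: gains and losses on the full exposure hit capital directly,
    with no financing costs or forced liquidation.
    """
    exposure = capital * leverage  # total position size
    return capital + exposure * asset_return


def wipeout_move(leverage):
    """The adverse move in the position that erases all capital."""
    return -1 / leverage


if __name__ == "__main__":
    for move in (0.01, -0.01, wipeout_move(30), -0.05):
        remaining = equity_after_move(1.0, 30, move)
        print(f"position moves {move:+.2%}: $1.00 of capital becomes ${remaining:.2f}")
```

At thirty times leverage, a 1% gain on the position is a 30% gain on capital, which is how small spreads turn into large returns. The same multiplier means a move of about 3.3% against you erases everything, and anything worse leaves you owing money.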

To put that in perspective, imagine you make $75,000 a year. Every morning, you measure someone between 8 and 9 feet tall and have to make a single bet: will the next randomly chosen person you measure be taller or shorter? The answer is almost always shorter. There are billions of people shorter than 8 feet tall. In fact, well over 99.9% of human beings who have ever lived are shorter than 8 feet tall. You could play it safe, stake a few thousand dollars from your savings, and earn a modest return. Or you could borrow 30 times your annual salary, $2.25 million, and become a billionaire in half a month. If the world is truly Mediocrestan, the second bet is as safe as the first. That's exactly the logic LTCM followed.

The value of $1,000 invested in LTCM, in the Dow Jones Industrial Average, and monthly in U.S. Treasuries at constant maturity.

What LTCM didn't account for was that financial markets are not governed by Mediocrestan. They are governed by Extremistan. Returns are not normally distributed; they have fat tails, meaning catastrophic events happen far more frequently than a normal distribution would predict. Worse, those extremes are not independent. When Russia defaulted on its domestic debt in August 1998, it didn't cause a single isolated divergence in one specific bond. It caused a global panic. Investors everywhere rushed to safety simultaneously, and every correlation LTCM had built its strategy around broke down at once. Every single position moved against them at the same time. They had borrowed thirty times their capital over and over again on the same certainty that no one taller than Wadlow existed. Now imagine that on the seventeenth day of your height betting game, someone 28 feet tall walked through the door. Because of your borrowing behavior, your entire net worth evaporates and you're left owing the bank millions of dollars with no way to repay them.
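To get a feel for what "virtually impossible" means under a bell curve, you can compute how often a normal distribution says a move of a given size should happen. The sketch below assumes daily moves are independent and normally distributed, with 252 trading days per year; those are exactly the Mediocrestan assumptions this post is questioning, and the sigma levels are illustrative rather than figures from LTCM's actual models.

```python
import math


def normal_tail(k):
    """P(Z > k) for a standard normal Z, using the complementary error function."""
    return 0.5 * math.erfc(k / math.sqrt(2))


def years_between(k, trading_days_per_year=252):
    """Average wait, in years, between daily moves worse than k standard
    deviations, if daily moves were independent and normally distributed."""
    return 1 / (normal_tail(k) * trading_days_per_year)


if __name__ == "__main__":
    for k in (2, 3, 4, 5, 7, 10):
        print(f"{k:>2}-sigma day: probability {normal_tail(k):.3e}, "
              f"expected about once every {years_between(k):.3g} years")
```

Under these assumptions, a 5-sigma day should arrive roughly once in fourteen thousand years, and larger moves are rarer than the age of the universe allows. When markets deliver such days within a single career, the more reasonable conclusion is that the bell curve is the wrong model, not that the world got unlucky.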

This is the turkey problem in its most precise form. LTCM's models had been trained on years of market data, just like the turkey had a long, unbroken history of morning meals. The models said the future would look like the past, that the world's tallest man set an upper bound on what was possible. They were right, until the day Extremistan delivered a single event so far outside the realm of what was conceivable that it shattered their entire distribution. Bond prices that had shifted by small, predictable amounts each day suddenly lurched by untold magnitudes, simultaneously, across every market on earth. The fund lost $4.7 billion in a matter of weeks and required an orchestrated bailout to prevent a global financial collapse. The cleaver arrived, and they had expected another meal.

The Tail End

The core insight of this piece is deceptively simple. Not all uncertainty is created equal. Mediocrestan is the world of additive, independent effects. It's a world where the normal distribution holds and the past is a reasonable guide to the future. Extremistan is something else entirely. It's a world of multiplicative, interdependent effects, where a single observation can change everything. LTCM had plenty of smart people on staff and models of extraordinary sophistication. None of it mattered, because they were assuming Mediocrestan and living in Extremistan. Just like LTCM, the turkey had years of evidence that the farmer was trustworthy. The evidence was real. The inference was fatal.

The consequences of this assumption extend far beyond finance. War, pandemics, famine, supply chain collapse, and forest fires are not freak accidents. They are Extremistan phenomena, governed by multiplicative, interdependent effects that conventional models will consistently underestimate. The cost of that underestimation, measured in human lives and societal stability, can be enormous. Understanding which world you're operating in isn't an academic exercise. It's one of the most practical things one can do.

But knowing that two different worlds exist only gets you so far. It immediately raises a harder question: if Extremistan doesn't follow the bell curve, what does it follow? Do mathematical tools exist to predict and prepare for extreme events? The answer is mixed, and more counterintuitive than most people expect.

That's where we're going next.