A Cautionary Note on the Central Limit Theorem

The Central Limit Theorem (CLT) is fundamental in probability theory. It states that, given independent samples drawn from a fixed distribution with finite variance, the sampling distribution of their average approaches a normal distribution as the sample size grows. Furthermore, the standard deviation of this sampling distribution is proportional to the inverse square root of the sample size, so the spread of the sample mean shrinks as the number of samples grows.
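For reference, the statement above can be written compactly. The symbols here are notation assumed for illustration and do not appear in the original text: X_1, ..., X_n are the independent samples, μ is their common mean, and σ is their common (finite) standard deviation. The arrow notation assumes the amsmath package.

```latex
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
\sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr) \xrightarrow{d} \mathcal{N}(0, \sigma^2), \qquad
\operatorname{SD}(\bar{X}_n) = \frac{\sigma}{\sqrt{n}}
```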

However, in practice, the theorem’s assumptions are subtle and sometimes overlooked. A classic example illustrates this: imagine estimating the height of the Emperor of China by averaging the answers from a very large sample of one billion people. Suppose each person’s knowledge of the Emperor’s height is accurate to within ±1 metre. Naively applying the CLT, the uncertainty of the average would be 1 metre divided by the square root of one billion, or roughly 0.03 millimetres: an absurdly precise estimate.
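The naive figure can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the "±1 metre" is read as a standard deviation of 1 metre per answer; the variable names are illustrative only.

```python
import math

n_people = 1_000_000_000      # one billion answers
per_person_sd_m = 1.0         # assumption: "±1 metre" treated as a standard deviation

# Naive CLT scaling: standard error of the mean = sd / sqrt(n)
naive_se_m = per_person_sd_m / math.sqrt(n_people)
naive_se_mm = naive_se_m * 1000.0

print(f"naive standard error: {naive_se_mm:.4f} mm")  # prints 0.0316 mm
```

The result is only as good as the independence assumption behind the sd / sqrt(n) scaling.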

What’s the catch? The CLT requires logical independence among samples, not merely causal or physical independence. In reality, many people have never seen the Emperor. Their answers are shaped not by direct observation but by information passed through conversations, stories, and folklore. Because this knowledge is socially transmitted, the responses are logically correlated: knowing one person’s answer provides information about the others’. Any error in the shared source appears in every answer and does not shrink when the answers are averaged. This violates the logical independence assumption underpinning the CLT.
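The effect of shared information can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of the original anecdote: each answer is taken to be the true height plus a "shared" error common to everyone (standing in for folklore) plus an independent "private" error. The function name, the 1.70 m true height, and the error sizes are all hypothetical.

```python
import random
import statistics

def mean_estimate_errors(n_people, shared_sd, private_sd, n_trials, seed=0):
    """Return the error of the averaged answer in each simulated trial.

    Each answer = truth + shared error (identical for everyone in a trial)
                        + private error (independent per person).
    """
    rng = random.Random(seed)
    truth = 1.70  # hypothetical true height in metres
    errors = []
    for _ in range(n_trials):
        shared = rng.gauss(0.0, shared_sd)  # common "folklore" error
        answers = [truth + shared + rng.gauss(0.0, private_sd)
                   for _ in range(n_people)]
        errors.append(statistics.fmean(answers) - truth)
    return errors

def rms(values):
    """Root-mean-square of a list of numbers."""
    return (sum(v * v for v in values) / len(values)) ** 0.5

# Independent answers: the error of the mean shrinks like 1/sqrt(n).
independent = rms(mean_estimate_errors(2000, 0.0, 0.5, 200))
# Correlated answers: the shared error survives averaging.
correlated = rms(mean_estimate_errors(2000, 0.3, 0.5, 200))

print(f"independent: {independent:.3f} m, correlated: {correlated:.3f} m")
```

With these hypothetical settings, the independent case should come out near 0.5 / sqrt(2000), about 0.011 m, while the correlated case stays near the 0.3 m shared error however many people are polled.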

Since probability theory builds on formal logic, its independence condition demands that, given the background information, knowing one observation tells us nothing about another. The assumption of independent and identically distributed (IID) samples is ubiquitous in statistics, so it is vital to understand what it actually means. Though this may seem like nitpicking over semantics, the distinction has important implications for fields such as meta-analysis, where results from many studies are pooled.

In this context, Edwin Jaynes eloquently quotes Henri Poincaré:

“We all know that there are good and bad experiments. The latter accumulate in vain. Whether there are a hundred or a thousand, one single piece of work by a real master – by a Pasteur, for example – will be sufficient to sweep them into oblivion.”