A comparison of success rates that holds in each of several sub-populations can be reversed in the success rate of the population as a whole.

Or, to put it as simply as possible, even when several sets of ratios or percentages all point the same way, pooling the underlying numbers may give a result that contradicts every one of them.

An example may make matters clearer. Imagine that Alice and Betty are both jackalope hunters. Last week, Alice saw four jackalopes, and caught none of them. Betty was more fortunate, and saw seven jackalopes, of which she caught one. This week, they each did a bit better:

         Last Week     This Week     Total
Alice    0/4 (0%)      5/6 (83%)     5/10 (50%)
Betty    1/7 (14%)     3/3 (100%)    4/10 (40%)

As you can see, Betty had better percentages both weeks, but when the weeks are added together Alice has a better overall percentage.
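The arithmetic behind the table can be checked with a short Python sketch. The dictionary layout and the function names are only illustrative choices; the numbers are the ones given above.

```python
from fractions import Fraction

# (caught, seen) per hunter per week, copied from the table above
results = {
    "Alice": {"last week": (0, 4), "this week": (5, 6)},
    "Betty": {"last week": (1, 7), "this week": (3, 3)},
}

def rate(caught, seen):
    """Success rate for one period, kept exact as a fraction."""
    return Fraction(caught, seen)

def pooled(weeks):
    """Success rate over all periods: total caught / total seen."""
    caught = sum(c for c, _ in weeks.values())
    seen = sum(s for _, s in weeks.values())
    return Fraction(caught, seen)

# Betty is ahead in each individual week...
for week in ("last week", "this week"):
    a = rate(*results["Alice"][week])
    b = rate(*results["Betty"][week])
    print(f"{week}: Alice {float(a):.0%}, Betty {float(b):.0%}")

# ...but Alice is ahead once the weeks are pooled.
for hunter, weeks in results.items():
    print(f"total: {hunter} {float(pooled(weeks)):.0%}")
```

Note that the totals come from adding the counts, not from averaging the two weekly percentages.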

The paradox arises from taking relative data (in this case, percentages) as our measure, but then comparing them as if they were absolute values. While 3/3 does sound good, especially when phrased as 100%, it rests on only three sightings, and in that sense it is not as good as 5/6. If you only see each amount as a percentage then each amount, including the total amount, may be misleading. Of course, this is not the fault of the data, but rather of those interpreting the data.
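One way to see why a percentage cannot be treated as an absolute value: each hunter's total rate is a weighted average of her weekly rates, where the weight is that week's share of her sightings. The sketch below, with illustrative names and the numbers from the first table, prints that breakdown. Betty's 100% week carries a weight of only 3/10, while Alice's 83% week carries 6/10.

```python
from fractions import Fraction

# (caught, seen) for last week and this week, from the first table
records = {
    "Alice": [(0, 4), (5, 6)],
    "Betty": [(1, 7), (3, 3)],
}

def weighted_breakdown(weeks):
    """Return (weight, rate) pairs; weight is the week's share of all sightings."""
    total_seen = sum(seen for _, seen in weeks)
    return [(Fraction(seen, total_seen), Fraction(caught, seen))
            for caught, seen in weeks]

def pooled_rate(weeks):
    # The weighted average of the weekly rates equals total caught / total seen.
    return sum(weight * rate for weight, rate in weighted_breakdown(weeks))

for hunter, weeks in records.items():
    parts = " + ".join(f"({w}) * ({r})" for w, r in weighted_breakdown(weeks))
    print(f"{hunter}: {parts} = {pooled_rate(weeks)}")
```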

This is not a case of especially tricky percentages, but simply a case selected to show how bad our brains are at understanding percentages. The confusion becomes much greater when the numbers become bigger; while most of us understand the ratios of 5/10 or 3/3 very well, very few of us bother to process 210/700 in our heads in a meaningful manner.

More generally, Simpson's paradox occurs when a relevant variable is ignored; such a variable is called a lurking variable or confounding variable. In the jackalope case, the lurking variable is simply the differing sample sizes. In this case we should take the pooled data (the total) as a clear indicator that Alice is the better hunter... at least, until we can get an even larger sample. In other cases, matters may become more obfuscated. For example, we could have set up a case like this:

         Last Week     This Week     Total
Alice    3/12 (25%)    6/8 (75%)     9/20 (45%)
Betty    2/7 (29%)     3/3 (100%)    5/10 (50%)

In this case, Betty comes out ahead in every percentage, but has caught only 5 jackalopes, while Alice has caught 9. This would not qualify as an example of Simpson's paradox, because pooling the results did not result in a reversal of the apparent trend. However, it does illustrate that it is easy to misinterpret simple percentages when there aren't any odd reversals popping up to show us that something is wrong.
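The difference between the two examples can be stated as a simple test: one hunter wins every period, yet loses on the pooled figures. The following is a minimal sketch of such a test, using a hypothetical helper named shows_reversal; it is not a standard library function.

```python
from fractions import Fraction

def pooled(records):
    """Total caught / total seen over a list of (caught, seen) pairs."""
    return Fraction(sum(c for c, _ in records), sum(s for _, s in records))

def shows_reversal(first, second):
    """True if one side has the better rate in every period but the worse pooled rate."""
    diffs = [Fraction(c1, s1) - Fraction(c2, s2)
             for (c1, s1), (c2, s2) in zip(first, second)]
    overall = pooled(first) - pooled(second)
    first_wins_all = all(d > 0 for d in diffs)
    second_wins_all = all(d < 0 for d in diffs)
    return (first_wins_all and overall < 0) or (second_wins_all and overall > 0)

# First example: Betty wins both weeks, Alice wins the total.
print(shows_reversal([(0, 4), (5, 6)], [(1, 7), (3, 3)]))
# Second example: Betty wins both weeks and the total, so no reversal.
print(shows_reversal([(3, 12), (6, 8)], [(2, 7), (3, 3)]))
```

The helper only detects strict reversals; ties in any period or in the total are treated as no reversal, which is an assumption of this sketch.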

In this case, the percentages show "the percentage of times that a jackalope was both seen and caught". However, they hide the information "who finds the most jackalopes" and "who catches the most jackalopes", both of which may be more relevant to the jackalope industry. Or to put it another way, the percentages are correct in showing that Betty is the more efficient jackalope hunter, but they do not show that she is the more effective jackalope hunter.
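The three questions give different answers on the same totals, which a few lines of Python make plain. The variable names are illustrative; the figures are the totals from the second example.

```python
# Totals from the second example, as (caught, seen)
totals = {"Alice": (9, 20), "Betty": (5, 10)}

# Efficiency: share of sighted jackalopes that were caught
most_efficient = max(totals, key=lambda h: totals[h][0] / totals[h][1])
# Finding: raw number of jackalopes seen
most_sightings = max(totals, key=lambda h: totals[h][1])
# Effectiveness: raw number of jackalopes caught
most_catches = max(totals, key=lambda h: totals[h][0])

print("most efficient:", most_efficient)
print("most sightings:", most_sightings)
print("most catches:", most_catches)
```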

As you can see, Simpson's paradox is neither a paradox, nor a particularly special case as far as misunderstanding data goes. However, the above examples are extremely limited in complexity, and Simpson's paradox, along with other forms of statistical confusion, also appears in much more complex measures of correlation, where it may be quite tricky to unravel.


This effect is also known as the Yule–Simpson effect (it was originally discovered by Udny Yule in 1903, but not formally described until Edward H. Simpson published a technical paper on it in 1951), and less commonly, the 'reversal paradox', and the 'amalgamation paradox'.