Simpson's Paradox refers to the reversal of the direction of a comparison or an association when data from several groups are combined to form a single group.

This is best explained by using an example:
(example partially taken from the text The Practice of Statistics, second edition, by David Moore and George McCabe)

Say there are two schools in Michigan: one for business, the other for law. Here are their admission statistics:

Law school:

  • 100 males applied, 10 were accepted
  • 300 females applied, 100 were accepted.
  • The female acceptance rate is about 33% (100/300), while the male acceptance rate is only 10% (10/100)

Business school:

  • 600 males applied, 480 were accepted
  • 200 females applied, 180 were accepted
  • The female acceptance rate is 90% (180/200), while the male acceptance rate is 80% (480/600)

Each school admits a higher percentage of its female applicants than of its male applicants.

However, when the numbers are combined, here is what happens:

Across the two schools, a total of 700 males applied; 490 were accepted and 210 were denied.

Across the two schools, a total of 500 females applied; 280 were accepted and 220 were denied.

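This brings the combined acceptance rates to 56% for females (280/500) and 70% for males (490/700). When the statistics are combined, the schools appear to favor men, even though each school on its own accepts women at a higher rate. The reversal happens because most of the men (600 of 700) applied to the business school, which accepts most of its applicants, while most of the women (300 of 500) applied to the law school, which accepts far fewer.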
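
To check the arithmetic, here is a minimal Python sketch that recomputes the rates from the counts given in the example. The names data, rate, school_rates, and combined_rates are made up for this illustration and do not come from the textbook.

```python
from fractions import Fraction

# (applied, accepted) counts copied from the example above
data = {
    "law": {"male": (100, 10), "female": (300, 100)},
    "business": {"male": (600, 480), "female": (200, 180)},
}


def rate(applied, accepted):
    """Acceptance rate as an exact fraction."""
    return Fraction(accepted, applied)


def school_rates(data):
    """Acceptance rate for each sex within each school."""
    return {
        school: {sex: rate(*counts) for sex, counts in by_sex.items()}
        for school, by_sex in data.items()
    }


def combined_rates(data):
    """Acceptance rate for each sex after pooling both schools."""
    totals = {}
    for by_sex in data.values():
        for sex, (applied, accepted) in by_sex.items():
            prev_applied, prev_accepted = totals.get(sex, (0, 0))
            totals[sex] = (prev_applied + applied, prev_accepted + accepted)
    return {sex: rate(*counts) for sex, counts in totals.items()}


per_school = school_rates(data)
combined = combined_rates(data)

for school, rates in per_school.items():
    print(school, {sex: f"{float(r):.0%}" for sex, r in rates.items()})
print("combined", {sex: f"{float(r):.0%}" for sex, r in combined.items()})
```

Note that the combined rate is computed from the pooled counts, not by averaging the two per-school percentages. Each pooled rate is effectively a weighted average of the per-school rates, weighted by how many applicants of that sex applied to each school, and the difference in those weights is what produces the reversal.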