The Texas Sharpshooter Fallacy is a mistake statisticians can make when considering randomly distributed data. When you have a map showing the occurrences of events, the data set commonly seems to form clusters: regions with large numbers of data points. The question, however, is whether such clusters are statistically meaningful. In statistics, choosing which results to look at after the fact can appear to prove almost anything.

Otherwise known as "shooting the barn" statistics, or in German Zielscheibenillusion ("target illusion"), the analogy is with a marksman of dubious skill in the Lone Star State (no offence to Texans; I believe his origin is irrelevant). Most target shooters show their skill by attempting to shoot as close to a bullseye as possible. However, in an attempt to demonstrate his aim, our less skillful hero fires his gun into the side of a barn, then walks over and draws a target around the area where most of his bullets hit. The result: random impacts appear highly significant.

The fallacy is due to a form of clustering illusion. In making statistical observations, results will not be distributed with total uniformity but will naturally be sparser in some areas and denser in others, purely by chance. Similarly, if you toss a coin ten times you're likely to get two or three the same in a row, and highly unlikely to alternate head-tail-head-tail-head... The problem arises if a statistician then sees a region on a map where results appear to be clustered, draws a circle around them, and decides the increased occurrence must be non-random and hence significant.
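The coin-toss claim can be checked exactly rather than taken on faith. The following is a minimal sketch in Python that enumerates all 1,024 possible sequences of ten tosses and counts how many alternate perfectly and how many contain a run of three or more identical results. The helper name longest_run and the variable names are illustrative choices, not part of any standard method.

```python
from itertools import product

def longest_run(seq):
    """Length of the longest streak of identical consecutive outcomes."""
    best = current = 1
    for prev, nxt in zip(seq, seq[1:]):
        current = current + 1 if prev == nxt else 1
        best = max(best, current)
    return best

N_TOSSES = 10

# Every possible sequence of ten fair tosses is equally likely,
# so counting sequences gives exact probabilities.
sequences = list(product("HT", repeat=N_TOSSES))
total = len(sequences)

# A longest run of 1 means the sequence alternates perfectly.
alternating = sum(1 for s in sequences if longest_run(s) == 1)
run_of_three_plus = sum(1 for s in sequences if longest_run(s) >= 3)

print(f"perfect alternation: {alternating}/{total}")
print(f"run of 3 or more:    {run_of_three_plus}/{total}")
```

Only 2 of the 1,024 sequences alternate perfectly, while 846 of them (roughly 83%) contain a streak of at least three. Streaks, like clusters on a map, are what randomness normally looks like.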

The classic example of such faulty reasoning involves cancer clusters: areas of the country where cancer appears to occur more often than average. Such clusters will inevitably exist by chance alone (helped, of course, by the fact that people are not distributed uniformly), and they look impressive when drawn on maps. But tracking down an environmental cause often proves fruitless, because often there is none, Erin Brockovich notwithstanding.

The solution is twofold. First, when you see a cluster in any data, study the entire data set so you know how likely clusters are to form by chance; this will give a clue as to whether the results are random. Second, and more importantly, decide on your hypothesis (what you are testing for) before you perform a test. Don't pick a target after you've fired your gun.
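The difference between choosing the target before and after firing can be illustrated with a small simulation. This is a sketch under stated assumptions, not a real epidemiological test: points are scattered uniformly at random over a square divided into a grid, and a "cluster" is arbitrarily defined as a cell holding at least 1.5 times the expected number of points. The function name simulate and all of its parameters are hypothetical choices for this illustration.

```python
import random

def simulate(n_points=200, grid=5, factor=1.5, trials=2000, seed=0):
    """Return how often a 'cluster' is found in purely random data,
    (a) in one cell chosen in advance, (b) in whichever cell is densest."""
    rng = random.Random(seed)
    cells = grid * grid
    threshold = factor * n_points / cells  # assumed definition of a cluster
    pre_hits = post_hits = 0
    for _ in range(trials):
        counts = [0] * cells
        for _ in range(n_points):
            col = int(rng.random() * grid)
            row = int(rng.random() * grid)
            counts[row * grid + col] += 1
        # Target drawn before firing: always test cell 0.
        if counts[0] >= threshold:
            pre_hits += 1
        # Target drawn after firing: test the densest cell.
        if max(counts) >= threshold:
            post_hits += 1
    return pre_hits / trials, post_hits / trials

pre, post = simulate()
print(f"cluster in the pre-chosen cell: {pre:.1%} of trials")
print(f"cluster in the densest cell:    {post:.1%} of trials")
```

Because the densest cell always holds at least as many points as the pre-chosen one, the second rate can never be lower than the first, and in practice it is far higher. The data contain no real effect at all; the apparent clusters come entirely from drawing the target afterwards.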

References:

Paul Cox, Glossary of Mathematical Mistakes, "'Shooting the Barn' Statistics", http://xocxoc.home.att.net/math/glossary1.htm#Shooting, accessed December 16, 2002.

Robert Todd Carroll, "Texas-Sharpshooter Fallacy", Skeptic's Dictionary, 01/03/02, http://www.skepdic.com/texas.html, accessed December 16, 2002.

Kevin V. Johnson, "Cancer clusters are difficult to nail down", USA Today, Arlington, Apr 13, 1999, reproduced at http://www.heartland.org/archives/environment/nov99/clusters.htm, accessed December 16, 2002. (Also in USA Today's fee-only web archive at usatoday.com)