"There are three kinds of lies: lies, damned lies, and statistics"

declared Mark Twain, or Benjamin Disraeli, or somebody.

A common error made by people doing statistical tests is to forget about their underlying assumptions.

Some of the most commonly used statistical tests are based upon sample values taken from populations. These tests make assumptions about the distributions of those populations.  Their validity has been proven only for populations whose distributions belong to an assumed family described by a few parameters (a mean and a variance, say), and so they are called parametric tests.

A test performed on samples that do not meet the test's underlying assumptions is meaningless, so much numerical masturbation.

Although different tests make different assumptions, the most common assumption is for a normal distribution of the population values.  Unfortunately, populations that are not normally distributed (especially in the social sciences) appear with alarming frequency.

What happens then when your sample values don't match the assumptions of the test you wanted?  Use a test that doesn't make those assumptions.

One test that does not assume anything about a population's distribution is the chi-square test; another is the Kolmogorov-Smirnov test, which can be used to determine whether a sample follows a given distribution (such as a normal distribution).
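To make the Kolmogorov-Smirnov idea concrete, here is a minimal sketch in plain Python. It measures the largest gap between a sample's empirical distribution and a hypothesised normal distribution, then turns that gap into an approximate p-value. The function names (ks_statistic, ks_pvalue), the made-up samples, and the choice of the large-sample Kolmogorov series are all assumptions for illustration, not anything prescribed above. The hypothesised mean and standard deviation are fixed in advance; if you estimate them from the same sample, this p-value comes out too lenient.

```python
import math
from statistics import NormalDist

def ks_statistic(sample, cdf):
    """Largest gap between the sample's empirical CDF and the hypothesised CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x, so check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_pvalue(d, n, terms=100):
    """Approximate p-value from the large-sample Kolmogorov series (rough for small n)."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        return 1.0  # the gap is far too small to matter
    total = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2 * total))

n = 200
# Hypothetical data: evenly spaced quantiles, so no random seed is needed.
normal_like = [NormalDist(0, 1).inv_cdf((i + 0.5) / n) for i in range(n)]
skewed = [-math.log(1 - (i + 0.5) / n) for i in range(n)]  # exponential shape

d_norm = ks_statistic(normal_like, NormalDist(0, 1).cdf)
p_norm = ks_pvalue(d_norm, n)

# The skewed sample has mean 1 and standard deviation 1 in theory,
# so compare it with a normal distribution that has the same two values.
d_skew = ks_statistic(skewed, NormalDist(1, 1).cdf)
p_skew = ks_pvalue(d_skew, n)

print("normal-like:", d_norm, p_norm)
print("skewed:     ", d_skew, p_skew)
```

A small p-value says the sample is unlikely to have come from the hypothesised distribution. A large one only says the test found no evidence against it.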

Many parametric tests have counterparts that are based upon the order of the values, rather than the actual values:

These tests are not as powerful as their parametric counterparts when the assumptions are met, but at least you're not generating gibberish when they're not met.
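As one illustration of an order-based test, here is a rough sketch of the Mann-Whitney U test, the usual rank-based counterpart to the two-sample t-test. It is my choice of example; the data, the function names, and the normal approximation without a tie correction are assumptions made to keep the sketch short. The point to notice is that only the ranks enter the calculation, so an absurd outlier changes nothing.

```python
import math
from statistics import NormalDist

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Return (U for sample a, two-sided p-value by normal approximation)."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # assumes no ties
    z = (u1 - mean_u) / sd_u
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u1, p

# Hypothetical measurements from two groups.
group_a = [1.1, 2.3, 2.9, 3.8, 4.0, 5.2]
group_b = [4.5, 6.1, 6.8, 7.4, 8.0, 9.9]
u, p = mann_whitney_u(group_a, group_b)

# Turn the largest value into a wild outlier: the ordering is unchanged,
# so the test statistic and p-value are unchanged too.
group_b_outlier = group_b[:-1] + [9900.0]
u_outlier, p_outlier = mann_whitney_u(group_a, group_b_outlier)

print(u, p)
print(u_outlier, p_outlier)
```

With samples this small, exact tables for U are preferable to the normal approximation; the approximation is used here only to keep the example free of lookup tables.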

(If you use the SPSS statistical package, a whole hierarchy of commands beginning with NPAR TESTS is available.)

So, in the future, you're going to test your samples for a normal distribution before performing a parametric test, right?




