hypothesis test - Everything2.com

In statistical analysis, a hypothesis test can be carried out to decide whether an observed outcome is plausible under a given claim. The procedure starts with the determination of two items: the null hypothesis and the alternate hypothesis.

The null hypothesis, H₀, is defined as being the possibility that includes equality, i.e. A ≤ B, A = B, or A ≥ B.

The alternate hypothesis, H₁, is the possibility that does NOT include equality, i.e. A < B, A ≠ B, or A > B.

Translating a claim into null and alternate hypotheses requires that you know about one-tailed and two-tailed tests. You should also know about the standard normal distribution. Briefly, a test is two-tailed if the alternate hypothesis says that the mean is simply NOT equal to the claimed value (≠), so a result that is too far off in either direction counts against the null hypothesis. A test is one-tailed if the alternate hypothesis says that the mean is strictly greater than the claimed value (right-tailed) or strictly less than it (left-tailed). The "tails" are the parts of the standard normal curve that slope away from the center.

Significance level, or α, is assumed to be .05 if the party asking us to do the hypothesis test hasn't specified it. A larger α makes the critical region bigger, so the null hypothesis is easier to reject (less strict), whereas a smaller α makes it smaller (more strict). If the test is one-tailed, you leave alpha alone, but if it's two-tailed, you have to divide alpha by 2, since alpha is the total area of the critical region(s) and has to be split between the two tails. In the ASCII art curves below, the un-shaded regions indicate the critical regions. If our test statistic falls within them, it means that the probability of getting a result that extreme under the null hypothesis is smaller than the significance level (α), so we know that the event in question is unlikely and we reject the null hypothesis.

(Note: In the original writeup, I had it written that the critical regions were shaded. This is wrong! The UN-shaded regions, below, are the critical regions!)

Prob.             |
 .50            ..|..
              ..xx|xx..    Right-tailed test
             .xxxx|xxxx|
 .25        .xxxxx|xxxx|.
          ..xxxxxx|xxxx| ..
      ....xxxxxxxx|xxxx|   ....
 .00  -------------------------
IQ:         - <- 100 -> +


Prob.             |
 .50            ..|..
              ..xx|xx..    Left-tailed test
             |xxxx|xxxx.
 .25        .|xxxx|xxxxx.
          .. |xxxx|xxxxxx..
      ....   |xxxx|xxxxxxxx....
 .00  -------------------------
IQ:         - <- 100 -> +


Prob.             |
 .50            ..|..
              ..xx|xx..    Two-tailed test
             |xxxx|xxxx|
 .25        .|xxxx|xxxx|.
          .. |xxxx|xxxx| ..
      ....   |xxxx|xxxx|   ....
 .00  -------------------------
IQ:         - <- 100 -> +
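
The boundaries of those un-shaded regions can be worked out in a few lines of Python. This is only a sketch: it assumes the standard normal curve, uses `NormalDist` from the standard library, and the function name and the "right"/"left"/"two" labels are made up for this example.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return the z-value(s) where the critical region starts.

    tail is "right", "left" or "two" (labels chosen for this sketch).
    """
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)
    if tail == "left":
        return (z.inv_cdf(alpha),)
    if tail == "two":
        # alpha is split evenly between the two tails
        upper = z.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'right', 'left' or 'two'")

for tail in ("right", "left", "two"):
    print(tail, [round(v, 3) for v in critical_values(0.05, tail)])
```

With α = .05 this gives about 1.645 for a right-tailed test, -1.645 for a left-tailed test, and ±1.96 for a two-tailed test.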

Phew. Glad we have that out of the way. Moving on...

In this example, I will test whether most people are capable of correctly estimating their own intelligence. It seems that 90% of people think they are of above average intelligence, so let's see how likely that is. For reference, the mean (average) IQ is 100, and the standard deviation is 15.

The claim we are testing is that the mean (μ) IQ is >100. Since that statement doesn't contain any equality, it is the alternate hypothesis. The null and alternate hypotheses have to be complementary, so the null hypothesis is μ ≤ 100.

The chance that ONE person will be of above-average intelligence is 47.3%. (It would be 50% if we counted everyone with an IQ of 100 or more, but the proposition states "...ABOVE average intelligence", meaning that we want people with an IQ of 101 or above.)
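
That 47.3% can be checked quickly. A minimal sketch, assuming IQ follows a normal curve with mean 100 and standard deviation 15, and treating "101 or above" as the area to the right of 101 (the variable names are just for this example):

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# Area under the curve to the right of an IQ of 101
p_above = 1 - iq.cdf(101)
print(round(p_above, 3))  # 0.473
```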

Oh, there's one more thing. For a hypothesis test, we need to know how big the sample is. We'll interview 100 people, so n = 100.

μ = population mean

H₀ = null hypothesis
   = μ ≤ 100

H₁ = alternate hypothesis
   = μ > 100

α = significance level
  = .01 (usually .05, larger = less strict, smaller = more strict)

n = sample size
  = 100

x = number of people in the sample claimed to be above average
  = 90

pˆ = x / n, i.e. the sample proportion, i.e. "P-hat"
   = 90/100
   = .9

p = population proportion in the null hypothesis
  = Probability of one person being > 100
  = .473 (47.3% mentioned above)

q = Probability of one person being ≤ 100
  = 1 - p
  = .527
                _________
z = (pˆ - p) / √((p*q)/n)
  = .427 / .049927046778
  = 8.55248
  = 8.552 (we round to three decimal places)

That Z-score (8.552) is WAAAAAY far away from the center! If you have read my standard distribution writeup, you know that 99.7% of the sample should wind up within merely THREE Z-scores (i.e. standard deviations) of the center.

The probability of a Z-score that large is roughly 6 x 10^-18. In other words, the probability that you can interview 100 people and find 90 or more of them with an IQ of 101 or more is about 6 in 1,000,000,000,000,000,000. This is an illustration of the chances AGAINST finding 90 out of 100 people with an IQ over 100 (shaded area = probability):

Prob.             |
 .50            ..|..       {}
              ..xx|xx..     {} Several light years
             .xxxx|xxxx.    {} to the right...
 .25        .xxxxx|xxxxx.   {}
          ..xxxxxx|xxxxxx.. {}   | <- μ > 100 ->∞
      ....xxxxxxxx|xxxxxxxx.{}...|..........................
 .00  ------------------------------------------------------
IQ:         - <- 100 -> +

OK... now remember that we said α = .01? Well, 6 x 10^-18 is a LOT smaller than .01. Therefore, our result DOES fall within the critical region, and we reject the null hypothesis.
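
For anyone who wants to redo the arithmetic, here is a rough sketch of the whole calculation in Python. It assumes a right-tailed test on a proportion and takes the null proportion to be .473, i.e. the chance of one person having an IQ above 100. The function name is made up for this example.

```python
from math import erfc, sqrt

def one_proportion_z_test(x, n, p0):
    """Right-tailed z-test of the sample proportion x/n against p0."""
    p_hat = x / n
    q0 = 1 - p0
    z = (p_hat - p0) / sqrt(p0 * q0 / n)
    # Area to the right of z under the standard normal curve. erfc stays
    # accurate this far out in the tail, where 1 - cdf(z) would round to 0.
    p_value = 0.5 * erfc(z / sqrt(2))
    return z, p_value

z, p_value = one_proportion_z_test(x=90, n=100, p0=0.473)
alpha = 0.01
print(f"z = {z:.3f}, probability = {p_value:.2e}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The comparison at the end is the same decision as above: a probability smaller than α means the result sits in the critical region.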

By the way, there's ANOTHER way to test this, and that is to find the critical value, CV. This is what marks the start of the critical region(s). Our sample has more than 30 people in it, which means we can use the standard normal distribution to find CV. (For fewer than 30, we'd use the Student's t distribution instead.) Since this is a one-tailed test, with an α of .01, and we are using the normal distribution, I can use my trusty TI-83 (or TI-86 with TI's inferential statistics package) to find the "inverse normal". The area to the left of the critical value is 1 - α = .99, so I ask it for invNorm(.99) and get 2.326. (With the usual α of .05, it would be invNorm(.95) = 1.645.) If the Z-score is greater than the critical value, it falls within the critical region and we reject the null hypothesis. (That's why we went to the trouble of calculating the Z-score, above.)
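
If you don't have the calculator handy, the same step can be sketched in Python. Here `NormalDist().inv_cdf` from the standard library stands in for the calculator's inverse normal; the function names are invented for this example, and the 7.0 is just a stand-in for any Z-score as far out as the one in the worked example.

```python
from statistics import NormalDist

def right_tailed_critical_value(alpha):
    # The z-value with area 1 - alpha to its left under the standard
    # normal curve, i.e. where the right-hand critical region starts.
    return NormalDist().inv_cdf(1 - alpha)

def rejects_null(z_score, alpha):
    # Right-tailed rule: reject H0 when z lands beyond the critical value.
    return z_score > right_tailed_critical_value(alpha)

print(round(right_tailed_critical_value(0.01), 3))  # 2.326
print(round(right_tailed_critical_value(0.05), 3))  # 1.645
print(rejects_null(7.0, 0.01))                      # True
```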

The final step is to formally state the conclusion:

"There is sufficient evidence to warrant rejection of the claim that 90% of people interviewed are likely to be of above average intelligence. Therefore, since 90% of people believe that they are of above average intelligence, it can be surmised that most people are overestimating their IQs."

Please note that I am not a statistician by trade, so this writeup is not guaranteed to be 100% correct. I'm pretty sure it is though.
