Prob.             |
 .40            ..|..
              ..  |  ..
             .    |    .
 .20        .     |     .
          ..      |      ..
      ....        |        ....
 .00  -------------------------
          - <- Sample -> +
Fig. 1 - Curve of the standard normal distribution

In statistics, the probability of any given event is between zero and one, inclusive. In other words, 0 <= P <= 1. The standard normal curve is set up so that the total area under the curve is exactly 1, in order to match up with that principle. The curve stretches out to -infinity on the left and +infinity on the right, approaching zero but never reaching it (i.e. it's asymptotic).

Prob.             |
 .40            ..|..
              ..xx|xx..
             .xxxx|xxxx.
 .20        .xxxxx|xxxxx.
          ..xxxxxx|xxxxxx..
      ....xxxxxxxx|xxxxxxxx....
 .00  -------------------------
          - <- Sample -> +
Fig. 2 - The total area under the curve is 1
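
If you'd rather check the "total area is 1" claim than take it on faith, here is a minimal sketch in Python. It is my own illustration, not something the calculator does: the helper name area_under_curve and the step count are made up, and it assumes Python 3.8 or later for statistics.NormalDist. It adds up thin trapezoids under the curve between -10 and 10, which is wide enough that the leftover tails don't matter at this precision.

```python
from statistics import NormalDist

def area_under_curve(lower, upper, steps=20_000):
    """Approximate the area under the standard normal curve
    between lower and upper using the trapezoid rule."""
    z = NormalDist(mu=0, sigma=1)  # standard normal: mean 0, std dev 1
    width = (upper - lower) / steps
    # End points count half, interior points count fully.
    total = 0.5 * (z.pdf(lower) + z.pdf(upper))
    for i in range(1, steps):
        total += z.pdf(lower + i * width)
    return total * width

total_area = area_under_curve(-10, 10)
print(total_area)  # very close to 1
```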

The center of the curve (its peak, with half of the area on each side) is set by the mean of the data set, and the scale (i.e. width) is set by the standard deviation. For simplicity's sake, we'll say that the mean (μ) is 0 and the standard deviation (σ) is 1, which is what makes this the standard normal curve.

The probability that a sample will fall into a certain range is found by determining the area under the curve between the boundaries of that range. So, if you want to find the probability that a sample will land between .4 and .6, you'd find the area under the curve between .4 and .6. This can be done with some calculators (like the TI-83 and certain Casios), or with programs such as Excel, Statdisk, and Minitab. (n.b.: The TI-86 can do this as well, but you need to download the infstats package from www.ticalc.org in the Math section under TI-86 assembly programs.)

With the TI-83, you'd use the normalcdf function like this:

normalcdf(lowerbound, upperbound, μ, σ)

As a matter of fact, the TI-83 will assume that μ=0 and σ=1 if you don't supply them, since that describes the standard distribution for Z scores. Since I stated above that I'd use μ=0 and σ=1 for this example, to find the area between .4 and .6 you'd type normalcdf(.4,.6 and press Enter (it doesn't mind that you left out the closing parenthesis). The result of that calculation is .070325238362, which we round to .070, or 7%.
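
No TI-83 handy? The same area can be had in Python. This is only a sketch of the idea, not the calculator's actual algorithm: normal_area is a name I made up to mirror normalcdf, and it assumes Python 3.8 or later for statistics.NormalDist. The area between two bounds is just the cumulative probability at the upper bound minus the cumulative probability at the lower bound.

```python
from statistics import NormalDist

def normal_area(lower, upper, mu=0.0, sigma=1.0):
    """Area under a normal curve between lower and upper.
    Like the calculator, defaults to mu=0 and sigma=1."""
    dist = NormalDist(mu, sigma)
    # cdf(x) is the area to the left of x, so subtract to get the slice.
    return dist.cdf(upper) - dist.cdf(lower)

area = normal_area(0.4, 0.6)
print(round(area, 6))
```

Rounded to three places, that is the same .070 as above.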

Prob.             |
                  0   .4 .6
 .40            ..|..  |  |
              ..  |  ..|  |
             .    |    |  |
 .20        .     |    |x |
          ..      |    |xx|
      ....        |    |xx|....
 .00  -------------------------
          - <- Sample -> +
Fig. 3 - The area under the curve between .4 and .6

The area under the curve, as bounded between .4 and .6, is slightly more than 7%. I hope my crude ASCII art makes that plain to see! If you were to take normalcdf(-999999,999999, the calculator would return the value 1. The true area is a hair less than 1, but the difference is far smaller than the calculator's precision can represent, so it just returns a 1 (see fig. 2). For this reason, we usually stand in for infinity with extreme values like ±999999 or ±999, because the area beyond them is minuscule. In fact, 99.7% of the area of the curve is between -3 and 3, so sometimes we just use μ±10σ. The calc says THAT'S equal to 1 as well, so ±10 is probably okay unless you are working with incredibly fine tolerances. YMMV, especially if (μ=0, σ=1) doesn't apply to what you're looking at. And if you are looking at something other than Z-scores, μ and σ are probably NOT exactly 0 and 1.
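
To see how little is left in the tails, here is a rough Python sketch (again assuming Python 3.8 or later; area_within is a hypothetical helper, and the μ=100, σ=15 values are made-up numbers purely for illustration). It measures the area within k standard deviations of the mean, which shows why the bounds have to scale with μ and σ when you aren't working with Z-scores.

```python
from statistics import NormalDist

def area_within(k, mu=0.0, sigma=1.0):
    """Area under a normal curve between mu - k*sigma and mu + k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

within_3 = area_within(3)     # roughly 99.7% of the area
within_10 = area_within(10)   # indistinguishable from 1 in double precision

# Made-up non-standard example: the bounds become 100 +- 3*15, i.e. 55 to 145.
within_3_shifted = area_within(3, mu=100, sigma=15)

print(within_3, within_10, within_3_shifted)
```

The shifted curve gives the same answer as the standard one because the bounds were scaled along with it; plugging -3 and 3 straight into a curve with μ=100 would give essentially zero.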

Oh, by the way, the TI-83 also has a function that does the opposite of normalcdf(: invNorm(area,μ,σ). Again, it assumes μ=0, σ=1 if you don't supply them. The argument is the area (i.e. probability) to the left of the point you're looking for, and the function returns that point along the curve. So, if you wanted to compute the first quartile (i.e. 25%), you'd enter invNorm(.25, which comes out to about -.674. On the TI-86, the function names are shortened but take the same arguments, so you'd use nmcdf() and invnm().
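
The inverse lookup can be sketched in Python too, assuming Python 3.8 or later, where statistics.NormalDist has an inv_cdf method. The variable names here are mine. You hand it the area to the left and it hands back the point; feeding that point back into cdf is a quick sanity check that the two really are opposites.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal, as in the rest of this writeup

# First quartile: the point with 25% of the area to its left.
first_quartile = z.inv_cdf(0.25)

# Round trip: the area to the left of that point should be 0.25 again.
area_back = z.cdf(first_quartile)

print(round(first_quartile, 3), round(area_back, 2))
```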
