standard deviation - Everything2.com

by dorward

Sat Apr 01 2000 at 13:09:18

A parameter that indicates how widely a probability function or a probability density function is spread around its mean.

In other words, how far a typical result lies from the mean (strictly, the root-mean-square distance rather than the plain average distance).

To calculate:

1. Add together all items.
2. Divide the total by the number of items. This is the mean.
3. Subtract the mean from each item.
4. Multiply each result by itself.
5. Add all these results together.
6. Divide by the number of items. This is the variance.
7. Take the square root of the variance. This is the standard deviation.
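
The steps above can be sketched in Python as follows. The function name and the sample values are illustrative assumptions, not part of the recipe itself. The code divides by the number of items, as the steps do, so it gives what is usually called the population form of the standard deviation.

```python
import math

def standard_deviation(items):
    """Follow the steps above one at a time (hypothetical helper name)."""
    total = sum(items)                          # add together all items
    mean = total / len(items)                   # divide by the number of items
    differences = [x - mean for x in items]     # difference between each item and the mean
    squares = [d * d for d in differences]      # multiply each result by itself
    variance = sum(squares) / len(items)        # add them up, divide by the number of items
    return math.sqrt(variance)                  # square root of the variance

# Made-up values, chosen so the arithmetic is easy to check by hand.
print(standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```

The sign of each difference does not matter, because every difference is squared in the next step.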

The smaller it is (in proportion to the mean), the better the mean represents the data, and the less likely it is that a few very high or very low results are distorting it.


by Professor Pi

Mon Jan 08 2001 at 16:46:47

There are two commonly used definitions of the standard deviation.

The population standard deviation:

σ_x=(1/N × Σd_i²)^½

and the sample standard deviation:

σ_x=(1/(N-1) × Σd_i²)^½

where: σ_x is the standard deviation, N is the number of measurements, and d_i is the difference between measurement i and the mean.

Both methods are frequently used. The sample standard deviation corrects the tendency to understate the uncertainty in measurements, especially when the sample size is small. This can be understood for the extreme case where there is only one measurement available (N=1). In this case the population standard deviation gives the absurd result σ_x=0, whereas the sample standard deviation is undefined (0/0); a proper representation of the ignorance of the standard deviation after just one measurement.
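
The N=1 case can be tried with Python's standard `statistics` module, which offers `pstdev` (population) and `stdev` (sample). This is only a sketch: the helper name is hypothetical, and the single value 17 is borrowed from the table further down. Where the formula gives 0/0, Python raises `StatisticsError` instead of returning a number.

```python
import statistics

def both_deviations(measurements):
    """Return (population, sample) standard deviation; None where undefined."""
    population = statistics.pstdev(measurements)
    try:
        sample = statistics.stdev(measurements)
    except statistics.StatisticsError:
        # stdev needs at least two data points, so N=1 is rejected
        sample = None
    return population, sample

print(both_deviations([17]))  # (0.0, None)
```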

For large N, the difference between the two definitions is insignificant: they differ only by a factor of (N/(N-1))^½, which approaches 1 as N grows. For small samples the difference can be noticeable. Either way, when you calculate the standard deviation, you need to be aware of the two definitions and report the one you are using. Also keep this in mind when you are using the standard deviation function in your favorite spreadsheet; e.g. Excel uses the STDEV function for the sample standard deviation and the STDEVP function for the population standard deviation (newer versions also call these STDEV.S and STDEV.P).


	Measurement   Measured value   Deviation              d_i²
	     i             x_i         d_i = x_i - x_mean

	     1             17              0.8                0.64
	     2             15             -1.2                1.44
	     3             16             -0.2                0.04
	     4             17              0.8                0.64
	     5             16             -0.2                0.04
	                  ____                                ____
	       x_mean =   16.2                       Σd_i² =  2.80

In this example, the population standard deviation is: σ_x=(1/5 × 2.80)^½=0.75, and the sample standard deviation is: σ_x=(1/4 × 2.80)^½=0.84.
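
The same numbers can be checked with Python's standard `statistics` module, as a rough sketch. The variable names are illustrative only; `pstdev` divides by N and `stdev` divides by N-1.

```python
import statistics

# The five measured values from the table above.
measurements = [17, 15, 16, 17, 16]

population_sd = statistics.pstdev(measurements)  # divides by N
sample_sd = statistics.stdev(measurements)       # divides by N - 1

print(round(population_sd, 2), round(sample_sd, 2))  # 0.75 0.84
```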

Use the sample standard deviation for calculating confidence intervals.


by Oneiromancer

Wed Feb 20 2002 at 20:39:46

Here is a formula for the standard deviation (from the mean), given a normalized density distribution foo(x). Note that a density distribution describes an effectively infinite number of points, so N is essentially infinite and the two versions of the standard deviation are essentially identical.

First, we will need to establish how to take averages of things in density distributions. For any distribution on x 'foo', and a property 'baz' which is a function of x,
Average(foo(x), baz(x)) = ∫foo(x)*baz(x) dx
with the integral taken on the whole domain of foo*.

The Standard Deviation of foo(x) = √( Average(foo(x), x²) - Average(foo(x), x)² )

Note that this is functionally identical to the infinite case of the formulation given in Professor Pi's writeup, but completely mangled so that it is easy to compute and difficult to see why it works.
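
One way to see why it works is to expand the square, written out here as a short derivation. The symbol μ is shorthand introduced for this sketch (it stands for Average(foo(x), x)), and the last step relies on foo being normalized, so that ∫foo(x) dx = 1.

```latex
\begin{aligned}
\sigma^2 &= \int \mathrm{foo}(x)\,(x-\mu)^2\,dx \\
         &= \int \mathrm{foo}(x)\,x^2\,dx \;-\; 2\mu \int \mathrm{foo}(x)\,x\,dx \;+\; \mu^2 \int \mathrm{foo}(x)\,dx \\
         &= \mathrm{Average}(\mathrm{foo}(x),\, x^2) - 2\mu^2 + \mu^2 \\
         &= \mathrm{Average}(\mathrm{foo}(x),\, x^2) - \mathrm{Average}(\mathrm{foo}(x),\, x)^2
\end{aligned}
```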

*If you don't see why that works, consider the simplest nontrivial case, Average(foo(x), x). Just think of it as grouping the elements together when adding them up: let's suppose foo(3) = 2, so we add two instances of 3 (i.e. 2*3) to the running total. Fortunately, since foo(x) is normalized, we've already divided by the number of points in the distribution. If you test this with an arbitrary function, make sure that it is normalized. For example, foo(x) = x on the domain (0,1) doesn't work, because it integrates to ½ rather than 1. However, foo(x) = 2x on the same domain does.
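
As a rough numerical check of the 2x example, here is a sketch that approximates the integrals with a simple midpoint rule. The `average` helper, its `steps` parameter and the choice of the midpoint rule are assumptions made for illustration; any numerical integrator would do.

```python
import math

def average(foo, baz, lo, hi, steps=100_000):
    """Approximate the integral of foo(x) * baz(x) over (lo, hi) by the midpoint rule."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width   # midpoint of the i-th slice
        total += foo(x) * baz(x)
    return total * width

def foo(x):
    return 2 * x  # the normalized example density on (0, 1)

norm = average(foo, lambda x: 1.0, 0.0, 1.0)       # close to 1, so foo is normalized
mean = average(foo, lambda x: x, 0.0, 1.0)         # close to 2/3
mean_sq = average(foo, lambda x: x * x, 0.0, 1.0)  # close to 1/2
sd = math.sqrt(mean_sq - mean ** 2)                # close to sqrt(1/18)

print(norm, mean, sd)
```

Swapping in the unnormalized density x makes `norm` come out near ½, which is the sign that the averages can no longer be trusted.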

Also note that aside from the standard deviation from the mean (which is the subject of the node up to this point) there is, distinctly, the standard deviation of the mean. This is used when you are sampling a distribution and are attempting to determine how well you have constrained the mean of that distribution from your sampling.
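
A minimal sketch of the standard deviation of the mean, assuming the usual estimate (often called the standard error): the sample standard deviation divided by √N. That formula is not given in this node, so treat it as an assumption. The data are the five measurements from Professor Pi's table.

```python
import math
import statistics

measurements = [17, 15, 16, 17, 16]

sample_sd = statistics.stdev(measurements)             # spread of the individual measurements
sd_of_mean = sample_sd / math.sqrt(len(measurements))  # how tightly the mean is constrained

print(round(sd_of_mean, 2))  # 0.37
```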

Also, you can calculate the standard deviation from any other statistical estimate you can imagine, not just the mean. The standard deviation from the median, for example. Just take the mean-under-quadrature (that is, the root mean square) of the differences from that estimate.
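
As an illustration, here is a sketch of the standard deviation from the median. It reads "mean-under-quadrature" as the root mean square and divides by N; both choices are assumptions. The data are again the five measurements from Professor Pi's table.

```python
import math
import statistics

measurements = [17, 15, 16, 17, 16]

median = statistics.median(measurements)                      # 16
squared = [(x - median) ** 2 for x in measurements]           # squared differences from the median
sd_from_median = math.sqrt(sum(squared) / len(measurements))  # root of their mean

print(round(sd_from_median, 2))  # 0.77
```

The result is slightly larger than the 0.75 obtained from the mean, as expected: the mean is the point that minimizes the mean squared difference.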
