A more rigorous definition of a probability density function (from a mathematical rather than a statistical point of view; statistics is just a manipulation of probability):
First of all, in order to have a density function, the random variable that it describes must be continuous.
A random variable X is continuous if its probability distribution function, F(x) = P(X <= x) (the probability that X is less than or equal to some value x), can be written as:
F(x) = ∫_{-∞}^{x} f(u) du
for some integrable f: R -> [0, ∞). f is called the probability density function of the random variable X.
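As a rough numerical illustration of this definition, the sketch below recovers F from f by integrating. The exponential density with rate 1, the midpoint rule, and the names `density` and `cdf` are all assumptions made for the example, not part of the definition.

```python
import math

def density(u):
    """Exponential density with rate 1 (illustrative choice): zero for u < 0."""
    return math.exp(-u) if u >= 0 else 0.0

def cdf(x, steps=100_000):
    """Approximate F(x), the integral of density from -infinity to x (midpoint rule).

    This density vanishes below 0, so the integral effectively starts at 0.
    """
    if x <= 0:
        return 0.0
    width = x / steps
    return sum(density((i + 0.5) * width) for i in range(steps)) * width

approx = cdf(2.0)
exact = 1 - math.exp(-2.0)  # closed form of F(2) for this particular density
print(approx, exact)
```

The two printed numbers should agree to many decimal places; any other nonnegative integrable function with total integral 1 could stand in for `density`.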
The density function of F is not prescribed uniquely by this integral, since two integrable functions which take identical values except at some specific point have the same integrals. However, if F is differentiable at u, then we will normally set f(u) = F'(u).
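The convention f(u) = F'(u) can be checked numerically. The sketch below assumes the distribution function of an exponential variable with rate 1 and a central-difference derivative; the helper names `F` and `derivative` are hypothetical.

```python
import math

def F(x):
    """Distribution function of an exponential variable with rate 1 (illustrative)."""
    return 1 - math.exp(-x) if x > 0 else 0.0

def derivative(func, u, h=1e-6):
    """Central-difference approximation of func'(u)."""
    return (func(u + h) - func(u - h)) / (2 * h)

u = 1.0
slope = derivative(F, u)
print(slope, math.exp(-u))  # the slope of F at u next to the density e^(-u)
```

At u = 0 this F has a corner and is not differentiable, which is exactly the kind of isolated point where the value of f can be chosen freely.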
So what does probability have to do with this? Remember, since X is a continuous random variable, it is just that: continuous. For example, if the RV X is continuous (can take any value) between 0 and 5, the probability that X = 3 is zero. There are uncountably many values in (0,5), and F, being an integral of f, has no jumps, so a particular value has probability 0.
However, one can find the probability that the value is between certain values a and b by taking the integral:
1) P(a <= X <= b) = ∫_{a}^{b} f(x) dx
The same formula works for more general subsets of R than an interval (more on this below).
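To make this concrete, here is a minimal sketch that assumes X is uniform on (0, 5), so the density is the constant 1/5 there. It approximates the probability of an interval by a midpoint-rule integral, then shrinks an interval around 3 to show the probability of the single value vanishing. The names `density` and `prob` are made up for the example.

```python
def density(x):
    """Uniform density on (0, 5): constant 1/5 inside, 0 outside."""
    return 0.2 if 0 < x < 5 else 0.0

def prob(a, b, steps=10_000):
    """Approximate P(a <= X <= b) as the integral of density over [a, b]."""
    width = (b - a) / steps
    return sum(density(a + (i + 0.5) * width) for i in range(steps)) * width

print(prob(1, 3))  # close to 0.4, i.e. (3 - 1) / 5

# Intervals shrinking around 3: the probability shrinks toward 0 with eps.
for eps in (0.1, 0.001, 0.00001):
    print(eps, prob(3 - eps, 3 + eps))
```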
Another property, mentioned above, is that:
2) ∫_{-∞}^{+∞} f(x) dx = 1.
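Property 2 can be sanity-checked numerically. The sketch below uses the standard normal density as an assumed example and truncates the integral to [-10, 10], on the assumption that the mass outside that range is negligible. `normal_density` and `integrate` are illustrative names.

```python
import math

def normal_density(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def integrate(func, a, b, steps=200_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width

# The real line is truncated to [-10, 10]; the tails beyond are negligible here.
total = integrate(normal_density, -10, 10)
print(total)
```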
But...why does this characterize density functions?
You had to ask. Let J be the collection of all open intervals in R. J can be extended to a unique smallest σ-field ℬ = σ(J) which contains J; ℬ is called the Borel σ-field, and its members are called Borel sets. Setting P_X(B) = P(X ∈ B) for each Borel set B in ℬ, we can check that (R, ℬ, P_X) is a probability space.
Conversely, suppose that f: R -> [0, ∞) (mapping into the set of nonnegative real numbers) is integrable and satisfies (2). For any B in ℬ, we define
P(B) = ∫_{B} f(x) dx
Then (R, ℬ, P) is a probability space and f is the density function of the identity random variable X: R -> R given by X(x) = x for any x in R.
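For readers who want the "we can check" steps spelled out, the following is a sketch of the verification in LaTeX. It writes \mathcal{B} for the Borel σ-field and \mathbf{1}_{B} for the indicator function of a set B. The exchange of sum and integral is justified here by the monotone convergence theorem, which is an ingredient assumed from measure theory rather than something stated in the text.

```latex
% P(B) = \int_B f(x)\,dx defines a probability measure on (\mathbb{R}, \mathcal{B}):
\begin{align*}
P(B) &= \int_B f(x)\,dx \ge 0
  && \text{since } f \ge 0, \\
P(\mathbb{R}) &= \int_{-\infty}^{\infty} f(x)\,dx = 1
  && \text{by (2)}, \\
P\Bigl(\bigcup_{n} B_n\Bigr) &= \int \sum_{n} \mathbf{1}_{B_n}(x)\, f(x)\,dx
  = \sum_{n} \int_{B_n} f(x)\,dx = \sum_{n} P(B_n)
  && \text{for disjoint } B_n \in \mathcal{B}.
\end{align*}
% For the identity variable X(x) = x, the event \{X \le x\} is the set (-\infty, x], so
\[
F(x) = P(X \le x) = P\bigl((-\infty, x]\bigr) = \int_{-\infty}^{x} f(u)\,du ,
\]
% which is exactly the defining property of a density function for X.
```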
Now, the standard normal distribution is a perfectly good example of a continuous distribution, but it is, despite what statisticians want you to think, not the only one. The exponential distribution is very useful, as are the gamma distribution, the Cauchy distribution, the beta distribution, and the Weibull distribution.
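As a quick check that several of these really are density functions, the sketch below integrates four of them numerically. The parameter values, the truncation of the unbounded supports at 50, and the helper name `integrate` are all assumptions made for the example. The Cauchy density is left out because its heavy tails make a truncated integral converge slowly.

```python
import math

def integrate(func, a, b, steps=200_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width

# name -> (density, lower bound, upper bound); parameters are illustrative choices.
densities = {
    "exponential (rate 1)": (lambda x: math.exp(-x), 0.0, 50.0),
    "gamma (shape 2, scale 1)": (lambda x: x * math.exp(-x), 0.0, 50.0),
    "Weibull (shape 2, scale 1)": (lambda x: 2 * x * math.exp(-x * x), 0.0, 50.0),
    "beta (2, 3)": (lambda x: 12 * x * (1 - x) ** 2, 0.0, 1.0),
}

totals = {name: integrate(f, a, b) for name, (f, a, b) in densities.items()}
for name, total in totals.items():
    print(name, total)  # each total should be very close to 1
```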