WARNING: Lots of HTML math ahead!

**Foreword**

There are several specific results known as central limit theorems, each sometimes referred to as "the" central limit theorem. Here we will focus on one particular version, which we will prove using characteristic functions.

A word on notation: here we will use E(x) to denote the expectation value of a random variable x. There are other conventions in common use, including angle brackets ⟨x⟩. The symbol i will denote the imaginary unit, while j and n will be used for counting indices.

**Theorem**

Consider a set {x_{j}}, j=1,...,N of N independent random variables with expectation E(x_{j}) = μ_{j} and variance E(x_{j}^{2})-E(x_{j})^{2} = σ_{j}^{2}, where the σ_{j} are real and finite. (A specific additional condition on the σ_{j} will be discussed later.) Let σ = (Σ_{j}σ_{j}^{2})^{1/2} and define a new variable z = Σ_{j}(x_{j}-μ_{j})/σ as the (scaled and shifted) sum of the x_{j}. Then as N→∞ the distribution of z approaches normal, i.e. p(z) = (2π)^{-1/2}exp[-z^{2}/2], where p(z) is the density function of z.

**Preliminary Definitions**

The characteristic function Φ(k) for a variable x is defined as

Φ(k) = E(exp[ikx]) = ∫exp[ikx]p(x)dx
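As a quick numerical sketch (Python, standard library only), this expectation can be estimated by averaging exp[ikx] over samples. The exponential distribution with rate 1 is used here purely for illustration, because its characteristic function has the simple closed form 1/(1-ik):

```python
import cmath
import random

random.seed(0)

def empirical_cf(samples, k):
    """Monte Carlo estimate of Phi(k) = E(exp[ikx])."""
    return sum(cmath.exp(1j * k * x) for x in samples) / len(samples)

# Exponential distribution with rate 1; its characteristic
# function is known in closed form: Phi(k) = 1/(1 - ik).
samples = [random.expovariate(1.0) for _ in range(200_000)]

k = 0.5
estimate = empirical_cf(samples, k)
exact = 1 / (1 - 1j * k)
print(abs(estimate - exact))  # small Monte Carlo error
```

Note that Φ(0) = E(1) = 1 for any distribution, which the estimator reproduces exactly.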

The characteristic function is a calculational device for finding the moments E(x), E(x^{2}), etc. as

Φ^{(m)}(0) = i^{m}E(x^{m})

where Φ^{(m)}(k) represents the m^{th} derivative of Φ(k). If we can write these moments as derivatives of Φ(k), we can also do the reverse and write Φ(k) in a Taylor series:

Φ(k) = Σ E(x^{n})(ik)^{n}/n!
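To make the derivative identity concrete, here is a small sketch (an assumption for illustration: the exponential distribution with rate 1, whose Φ(k) = 1/(1-ik) and whose moments are E(x^{m}) = m!) checking Φ^{(m)}(0) = i^{m}E(x^{m}) by finite differences:

```python
def phi(k):
    """Characteristic function of an exponential distribution
    with rate 1: Phi(k) = 1/(1 - ik), so E(x^m) = m!."""
    return 1 / (1 - 1j * k)

h = 1e-5
# Central differences approximate derivatives of Phi at k = 0.
d1 = (phi(h) - phi(-h)) / (2 * h)            # Phi'(0)  = i^1 E(x)   = i
d2 = (phi(h) - 2 * phi(0) + phi(-h)) / h**2  # Phi''(0) = i^2 E(x^2) = -2
print(d1, d2)  # approximately 1j and -2
```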

The logarithm of the characteristic function is known as the cumulant generating function, defined as

Ψ(k) = ln[Φ(k)] = ΣC_{n}(ik)^{n}/n!

where the C_{n}, known as cumulants, are polynomials in the moments E(x), E(x^{2}), etc. Of special note are C_{1} = E(x) and C_{2} = E(x^{2})-E(x)^{2} = σ^{2}. Note that C_{0} = Ψ(0) = ln[Φ(0)] = ln(1) = 0, so this term is generally ignored.
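For illustration (again assuming an exponential distribution with rate 1, whose cumulants are known to be C_{n} = (n-1)!), the first few cumulants can be computed as polynomials in the sample moments:

```python
import random

random.seed(1)
xs = [random.expovariate(1.0) for _ in range(300_000)]
n = len(xs)

# Raw sample moments E(x), E(x^2), E(x^3).
m1 = sum(xs) / n
m2 = sum(x * x for x in xs) / n
m3 = sum(x * x * x for x in xs) / n

# The first cumulants as polynomials in the moments.
c1 = m1                            # C_1 = E(x), the mean
c2 = m2 - m1 ** 2                  # C_2 = the variance
c3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3

print(c1, c2, c3)  # exponential(1) has C_n = (n-1)!: about 1, 1, 2
```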

**Proof**

Let Φ_{z}(k) and Φ_{j}(k) denote the characteristic functions for z and the x_{j}. Then

Φ_{z}(k) = E(exp[ikz]) = E(exp[ikΣ_{j}(x_{j}-μ_{j})/σ]) = E(Π_{j} exp[ik(x_{j}-μ_{j})/σ]) = E(Π_{j} exp[ikx_{j}/σ] exp[-ikμ_{j}/σ])

As the x_{j} are independent the product can be moved outside the calculation of the expectation; so can the exponential in μ_{j}, as it is a constant. This results in

Φ_{z}(k) = Π_{j} E(exp[ikx_{j}/σ]) exp[-ikμ_{j}/σ] = Π_{j} Φ_{j}(k/σ) exp[-ikμ_{j}/σ]

Now we take the log, to change the characteristic functions into the cumulant-generating functions:

Ψ_{z}(k) = Σ_{j} [Ψ_{j}(k/σ) - ikμ_{j}/σ]

Substituting the Taylor expansions,

Σ_{n} C_{zn}(ik)^{n}/n! = Σ_{j} [Σ_{n} C_{jn}(ik/σ)^{n}/n! - ikμ_{j}/σ]

Coefficients of like powers of k must be equal on both sides, so we can solve for the C_{zn}. As C_{j1} = μ_{j} and C_{j2} = σ_{j}^{2} we find

C_{z1} = Σ_{j} (C_{j1} - μ_{j})/σ = Σ_{j} (μ_{j} - μ_{j})/σ = 0

C_{z2} = Σ_{j} C_{j2}/σ^{2} = (Σ_{j} σ_{j}^{2})/(Σ_{j} σ_{j}^{2}) = 1
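These two results hold exactly for any N, not just in the limit, which a short simulation can confirm. (The summands below are exponential variables with arbitrarily chosen, unequal rates, purely for illustration.)

```python
import math
import random

random.seed(2)

# Exponential variables with different rates: mean 1/lam, variance 1/lam^2.
rates = [0.5, 1.0, 2.0, 4.0]
mus = [1 / lam for lam in rates]
sigma = math.sqrt(sum(1 / lam ** 2 for lam in rates))

def draw_z():
    """One sample of z = sum_j (x_j - mu_j) / sigma."""
    return sum((random.expovariate(lam) - mu) / sigma
               for lam, mu in zip(rates, mus))

zs = [draw_z() for _ in range(100_000)]
mean = sum(zs) / len(zs)
var = sum((z - mean) ** 2 for z in zs) / len(zs)
print(mean, var)  # about 0 and 1
```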

Now consider the higher cumulants, C_{zn} = (Σ_{j} C_{jn})/σ^{n}. In the simplifying case where the x_{j} all have equal variance σ_{0}^{2}, we have σ = σ_{0}N^{1/2}, so the numerator grows like N while the denominator grows like N^{n/2}; for n > 2 the ratio therefore vanishes as N→∞ (provided the C_{jn} remain bounded). There are several weaker sufficient conditions on the distributions of the x_{j} which give the same result, including the Lyapunov, Lindeberg, and Feller–Lévy conditions; the study and proof of these variants is left to the interested reader. In all cases, C_{zn} → 0 for n > 2, so in the limit

Ψ_{z}(k) = (ik)^{2}/2! = -k^{2}/2

and

Φ_{z}(k) = exp[-k^{2}/2]

This is the characteristic function of a standard normal distribution; we can verify this by performing an inverse Fourier transform to recover p(z):

p(z) = (2π)^{-1} ∫exp[-ikz]Φ_{z}(k)dk = (2π)^{-1} ∫exp[-ikz-k^{2}/2]dk = (2π)^{-1}exp[-z^{2}/2] ∫exp[-(k+iz)^{2}/2]dk = (2π)^{-1/2}exp[-z^{2}/2]

where we have completed the square in the exponent, -ikz-k^{2}/2 = -(k+iz)^{2}/2 - z^{2}/2, and used the Gaussian integral ∫exp[-u^{2}/2]du = (2π)^{1/2}.

Thus the distribution of z converges to the standard normal, as desired.
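As a final sketch, a simulation (assuming, for illustration, a sum of N = 200 exponential variables with rate 1, standardized as in the theorem) shows both the empirical characteristic function approaching exp[-k^{2}/2] and the familiar 68% of samples falling within one standard deviation of zero:

```python
import cmath
import math
import random

random.seed(3)

N = 200               # number of summands; each x_j is exponential, rate 1
sigma = math.sqrt(N)  # sigma_j = 1 for all j, so sigma = N^(1/2)

def draw_z():
    """z = sum_j (x_j - mu_j) / sigma with mu_j = 1."""
    return sum(random.expovariate(1.0) - 1.0 for _ in range(N)) / sigma

zs = [draw_z() for _ in range(20_000)]

# Empirical characteristic function at k = 1 vs exp(-1/2).
k = 1.0
phi_z = sum(cmath.exp(1j * k * z) for z in zs) / len(zs)
print(abs(phi_z - math.exp(-k * k / 2)))  # small

# For a standard normal, about 68.27% of the mass lies within
# one standard deviation of zero.
frac = sum(1 for z in zs if abs(z) < 1) / len(zs)
print(frac)  # about 0.68
```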