Concrete illustration of the central limit theorem

This article illustrates the central limit theorem via an example for which the computation can be done quickly by hand on paper, unlike the more computing-intensive example in the article titled illustration of the central limit theorem. Suppose the probability distribution of a random variable X puts equal weights on 1, 2, and 3:


 * $$X=\left\{\begin{matrix} 1 & \mbox{with}\ \mbox{probability}\ 1/3, \\ 2 & \mbox{with}\ \mbox{probability}\ 1/3, \\ 3 & \mbox{with}\ \mbox{probability}\ 1/3. \end{matrix}\right.$$

The probability mass function of the random variable X may be depicted thus:

 o    o    o
 -----------
 1    2    3

Clearly this looks nothing like the bell-shaped curve.

Now consider the sum of two independent copies of X:


 * $$\left\{\begin{matrix}
1+1 & = & 2 \\ 1+2 & = & 3 \\ 1+3 & = & 4 \\
2+1 & = & 3 \\ 2+2 & = & 4 \\ 2+3 & = & 5 \\
3+1 & = & 4 \\ 3+2 & = & 5 \\ 3+3 & = & 6
\end{matrix}\right\}
=\left\{\begin{matrix}
2 & \mbox{with}\ \mbox{probability}\ 1/9 \\
3 & \mbox{with}\ \mbox{probability}\ 2/9 \\
4 & \mbox{with}\ \mbox{probability}\ 3/9 \\
5 & \mbox{with}\ \mbox{probability}\ 2/9 \\
6 & \mbox{with}\ \mbox{probability}\ 1/9
\end{matrix}\right\}$$

The probability mass function of this sum may be depicted thus:

           o
      o    o    o
 o    o    o    o    o
 ---------------------
 2    3    4    5    6

This still does not look very much like the bell-shaped curve, but, like the bell-shaped curve and unlike the probability mass function of X itself, it is higher in the middle than in the two tails.
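The table for the sum of two copies can be reproduced mechanically by convolving the probability mass function of X with itself. The following Python sketch does this with exact fractions; the names `PMF_X` and `pmf_of_sum` are illustrative choices made here, not part of any library.

```python
from collections import defaultdict
from fractions import Fraction

# Probability mass function of X: equal weight on 1, 2 and 3.
PMF_X = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

def pmf_of_sum(n, pmf=PMF_X):
    """Return the exact PMF of the sum of n independent copies of X."""
    total = {0: Fraction(1)}  # the empty sum equals 0 with probability 1
    for _ in range(n):
        nxt = defaultdict(Fraction)
        for s, p in total.items():
            for x, q in pmf.items():
                nxt[s + x] += p * q  # independence: probabilities multiply
        total = dict(nxt)
    return total

two = pmf_of_sum(2)
for value in sorted(two):
    print(value, two[value])
```

Fractions are printed in lowest terms, so the probability 3/9 of the value 4 appears as 1/3. Calling `pmf_of_sum(3)` gives the distribution of the sum of three copies in the same way.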

Now consider the sum of three independent copies of this random variable:


 * $$\left\{\begin{matrix}
1+1+1 & = & 3 \\ 1+1+2 & = & 4 \\ 1+1+3 & = & 5 \\
1+2+1 & = & 4 \\ 1+2+2 & = & 5 \\ 1+2+3 & = & 6 \\
1+3+1 & = & 5 \\ 1+3+2 & = & 6 \\ 1+3+3 & = & 7 \\
2+1+1 & = & 4 \\ 2+1+2 & = & 5 \\ 2+1+3 & = & 6 \\
2+2+1 & = & 5 \\ 2+2+2 & = & 6 \\ 2+2+3 & = & 7 \\
2+3+1 & = & 6 \\ 2+3+2 & = & 7 \\ 2+3+3 & = & 8 \\
3+1+1 & = & 5 \\ 3+1+2 & = & 6 \\ 3+1+3 & = & 7 \\
3+2+1 & = & 6 \\ 3+2+2 & = & 7 \\ 3+2+3 & = & 8 \\
3+3+1 & = & 7 \\ 3+3+2 & = & 8 \\ 3+3+3 & = & 9
\end{matrix}\right\}
=\left\{\begin{matrix}
3 & \mbox{with}\ \mbox{probability}\ 1/27 \\
4 & \mbox{with}\ \mbox{probability}\ 3/27 \\
5 & \mbox{with}\ \mbox{probability}\ 6/27 \\
6 & \mbox{with}\ \mbox{probability}\ 7/27 \\
7 & \mbox{with}\ \mbox{probability}\ 6/27 \\
8 & \mbox{with}\ \mbox{probability}\ 3/27 \\
9 & \mbox{with}\ \mbox{probability}\ 1/27
\end{matrix}\right\}$$

The probability mass function of this sum may be depicted thus:

                o
           o    o    o
           o    o    o
           o    o    o
      o    o    o    o    o
      o    o    o    o    o
 o    o    o    o    o    o    o
 -------------------------------
 3    4    5    6    7    8    9

Not only is this higher in the center than in the tails, but as one moves toward the center from either tail, the slope first increases and then decreases, just as with the bell-shaped curve.
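The remark about the slope can be checked numerically by taking successive differences of the column heights. The short Python sketch below counts the 27 equally likely triples and prints those differences; the variable names are illustrative only.

```python
from collections import Counter
from itertools import product

# Count how many of the 27 equally likely triples give each sum.
counts = Counter(sum(triple) for triple in product((1, 2, 3), repeat=3))
heights = [counts[s] for s in range(3, 10)]

# Successive differences play the role of the slope of the dot diagram.
slopes = [b - a for a, b in zip(heights, heights[1:])]

print(heights)  # [1, 3, 6, 7, 6, 3, 1]
print(slopes)   # [2, 3, 1, -1, -3, -2]
```

Reading the differences from the left tail toward the center gives 2, 3, 1: the rise first steepens and then flattens. By symmetry the same holds when approaching the center from the right tail.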

We can quantify the degree of its resemblance to the bell-shaped curve, as follows. Consider


 * Pr(X1 + X2 + X3 ≤ 7) = 1/27 + 3/27 + 6/27 + 7/27 + 6/27 = 23/27 = 0.851851851...
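This exact probability can also be obtained by brute-force enumeration rather than by adding the tabulated values. A minimal Python sketch, with variable names chosen here for illustration:

```python
from fractions import Fraction
from itertools import product

# All 27 equally likely outcomes of (X1, X2, X3).
outcomes = list(product((1, 2, 3), repeat=3))

# Count the outcomes whose sum does not exceed 7.
favourable = sum(1 for triple in outcomes if sum(triple) <= 7)

exact = Fraction(favourable, len(outcomes))
print(exact, float(exact))
```

The count of favourable outcomes is 23, so the fraction printed is 23/27, roughly 0.85185.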

How close is this to what a normal approximation would give? Each copy of X has expected value 2 and variance 2/3, so Y = X1 + X2 + X3 has expected value 6 and variance 2, that is, standard deviation equal to the square root of 2. Since Y ≤ 7 (weak inequality) if and only if Y < 8 (strict inequality), we use a continuity correction and seek


 * $$\mbox{Pr}(Y\leq 7.5)=\mbox{Pr}\left({Y-6 \over \sqrt{2}}\leq{7.5-6 \over \sqrt{2}}\right)=\mbox{Pr}(Z\leq 1.0606602\dots)\approx 0.8555778$$

where Z has a standard normal distribution. The difference between 0.85185... and 0.8556... is remarkably small, considering that only three independent random variables were added.
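The whole comparison can be reproduced in a few lines of Python. The sketch below first recovers the mean and variance of Y from those of a single copy of X, then evaluates the normal approximation with and without the continuity correction. The helper `standard_normal_cdf` is defined here through `math.erf` and is not a library function; the other names are likewise illustrative.

```python
import math
from fractions import Fraction

values = (1, 2, 3)
p = Fraction(1, 3)

# Moments of a single copy of X.
mean_x = sum(p * x for x in values)                 # 2
var_x = sum(p * (x - mean_x) ** 2 for x in values)  # 2/3

# Means add; variances add because the copies are independent.
n = 3
mean_y = n * mean_x              # 6
var_y = n * var_x                # 2
sd_y = math.sqrt(float(var_y))   # square root of 2

def standard_normal_cdf(z):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

exact = 23 / 27
with_correction = standard_normal_cdf((7.5 - float(mean_y)) / sd_y)
without_correction = standard_normal_cdf((7.0 - float(mean_y)) / sd_y)

print(f"exact               {exact:.7f}")
print(f"normal, at 7.5      {with_correction:.7f}")
print(f"normal, at 7        {without_correction:.7f}")
```

Evaluated this way, the continuity-corrected value is about 0.8556, while the uncorrected value at 7 is only about 0.7602, against the exact 0.8519. This suggests that most of the accuracy for so small a number of summands comes from the continuity correction.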