The Central Limit Theorem

The Central Limit Theorem states that the sample mean of iid random variables with finite variance is approximately normal for large sample sizes, regardless of the shape of the underlying distribution. This lesson develops the statement of the CLT, shows how to compute probabilities for sample means, applies the normal approximation to the binomial, and handles sums of iid random variables.

Tutorial

Statement of the Central Limit Theorem

Let $X_1, X_2, \ldots, X_n$ be independent, identically distributed (iid) random variables with mean $\mu$ and finite variance $\sigma^2$. The sample mean is

\bar{X}_n = \dfrac{X_1 + X_2 + \cdots + X_n}{n}.

The Central Limit Theorem (CLT) states that for large $n$, the sample mean is approximately normally distributed:

\bar{X}_n \;\approx\; N\!\left(\mu,\; \dfrac{\sigma^2}{n}\right).

That is, $\bar{X}_n$ is approximately normal with mean $\mu$ and variance $\sigma^2/n$. (These two moments hold exactly for every $n$; only the normal shape is an approximation.) The remarkable feature is that this holds regardless of the distribution of the $X_i$, which could be skewed, discrete, or bimodal, as long as $\sigma^2 < \infty$.
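The claim can be checked empirically. The following is a minimal simulation sketch, not part of the lesson itself: it assumes the $X_i$ are Exponential(1), a strongly skewed distribution with $\mu = 1$ and $\sigma^2 = 1$, and the names `sample_means`, `trials`, and the seed are illustrative choices. It uses only the Python standard library.

```python
import random
import statistics

def sample_means(n, trials, rng):
    """Return `trials` sample means, each computed from n iid Exponential(1) draws."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]

rng = random.Random(0)      # fixed seed so the run is reproducible (arbitrary choice)
n, trials = 50, 10_000      # illustrative sizes, not taken from the lesson
means = sample_means(n, trials, rng)

# Exponential(1) has mu = 1 and sigma^2 = 1, so the sample mean
# should have mean close to 1 and variance close to 1/n.
emp_mean = statistics.fmean(means)
emp_var = statistics.pvariance(means)
print("empirical mean:    ", emp_mean)
print("empirical variance:", emp_var)
print("sigma^2 / n:       ", 1.0 / n)

# A normal distribution puts about 68.3% of its mass within one
# standard deviation of the mean; compare with the simulated share.
sd = (1.0 / n) ** 0.5
within = sum(abs(m - 1.0) <= sd for m in means) / trials
print("share within 1 sd: ", within)
```

The empirical mean and variance should land near $1$ and $1/50 = 0.02$, and the share within one standard deviation should be near the normal value of about $0.683$, even though individual exponential draws are far from normal.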

A common rule of thumb is that $n \geq 30$ is large enough for the approximation to be useful, although strongly skewed or heavy-tailed distributions can require a larger $n$. (If the $X_i$ are themselves normal, $\bar{X}_n$ is exactly $N(\mu, \sigma^2/n)$ for every $n$.)
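One way to see why the sample size matters is to track how the skewness of $\bar{X}_n$ shrinks as $n$ grows. The sketch below again assumes Exponential(1) data, for which the skewness of the sample mean is $2/\sqrt{n}$; the helper `skewness`, the seed, and the chosen values of $n$ are illustrative assumptions rather than anything prescribed by the lesson.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: mean of the cubed standardized values."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values, mu=m)
    return statistics.fmean(((v - m) / sd) ** 3 for v in values)

rng = random.Random(1)   # arbitrary fixed seed for reproducibility
trials = 10_000          # number of simulated sample means per n

skew_by_n = {}
for n in (2, 10, 30):
    # Each entry is the mean of n iid Exponential(1) draws.
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]
    skew_by_n[n] = skewness(means)
    # Compare the simulated skewness with the theoretical 2 / sqrt(n).
    print(n, round(skew_by_n[n], 2), round(2 / n ** 0.5, 2))
```

A normal distribution has skewness $0$, so the decreasing values show the distribution of $\bar{X}_n$ becoming more symmetric. For this particular population some skewness remains even at $n = 30$, which is why the rule of thumb is only a guideline.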

For example, suppose the daily numbers of customers at a coffee shop are iid with mean $\mu = 200$ and variance $\sigma^2 = 144$. Then the average daily count over $n = 36$ days has approximate distribution

\bar{X}_{36} \;\approx\; N\!\left(200,\; \dfrac{144}{36}\right) = N(200,\, 4),

so the standard deviation of $\bar{X}_{36}$ is $\sqrt{4} = 2$.
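The coffee shop numbers can be reproduced, and turned into a probability, with Python's standard-library `statistics.NormalDist`. This is a small sketch under the CLT approximation above; the cutoff of 203 customers is a hypothetical value chosen only for illustration and does not come from the lesson.

```python
from statistics import NormalDist

mu, sigma2, n = 200, 144, 36            # values from the coffee shop example
se = (sigma2 / n) ** 0.5                # standard deviation of the sample mean
approx = NormalDist(mu=mu, sigma=se)    # CLT approximation N(200, 4)

threshold = 203                         # hypothetical cutoff, not from the lesson
p_above = 1 - approx.cdf(threshold)     # P(sample mean > 203) under the approximation

print(se)                   # 2.0
print(round(p_above, 4))    # 0.0668
```

Here 203 sits $(203 - 200)/2 = 1.5$ standard deviations above the mean, so the result is $1 - \Phi(1.5) \approx 0.0668$.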