Introduction to Chi-Square Goodness-of-Fit

Introduces the chi-square goodness-of-fit test for assessing whether observed categorical counts are consistent with a hypothesized probability distribution. Covers the test statistic, degrees of freedom, and the critical-value decision rule for both uniform and non-uniform null distributions.

Tutorial

The Chi-Square Goodness-of-Fit Statistic

We often want to test whether observed counts in $k$ categories are consistent with a hypothesized probability distribution. The chi-square goodness-of-fit test compares observed counts $O_1, O_2, \ldots, O_k$ to expected counts $E_1, E_2, \ldots, E_k$ predicted under the hypothesized distribution.

If the hypothesized distribution assigns probability $p_i$ to category $i$ and the sample has total size $n,$ then the expected count for category $i$ is

E_i = n\, p_i.

The chi-square goodness-of-fit statistic is

\chi^2 = \sum_{i=1}^{k} \dfrac{(O_i - E_i)^2}{E_i}.

Each term measures the squared deviation of an observed count from its expected count, scaled by that expected count. A large value of $\chi^2$ indicates poor agreement between the data and the hypothesized distribution.
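The definition above can be sketched as a short Python function. This is a minimal illustration, not a library routine: the name `chi_square_statistic` and its two arguments are assumptions introduced here, and the function takes the hypothesized probabilities $p_i$ as input so that it can compute $E_i = n\,p_i$ itself.

```python
def chi_square_statistic(observed, probabilities):
    """Return the chi-square goodness-of-fit statistic.

    observed      -- list of observed counts O_1, ..., O_k
    probabilities -- list of hypothesized probabilities p_1, ..., p_k
                     (assumed positive and summing to 1)
    """
    if len(observed) != len(probabilities):
        raise ValueError("observed and probabilities must have the same length")

    n = sum(observed)  # total sample size
    total = 0.0
    for o, p in zip(observed, probabilities):
        e = n * p                      # expected count E_i = n * p_i
        total += (o - e) ** 2 / e      # squared deviation scaled by E_i
    return total


# Hypothetical data: 100 observations in three categories,
# tested against probabilities 0.5, 0.25, 0.25.
# Expected counts are 50, 25, 25, so the statistic is 25/50 + 25/25 + 0.
print(chi_square_statistic([45, 30, 25], [0.5, 0.25, 0.25]))  # 1.5
```

When every observed count equals its expected count, each term is zero and the function returns $0,$ the smallest possible value of the statistic.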

To illustrate, suppose we toss a coin $80$ times and observe $50$ heads and $30$ tails. Under the hypothesis that the coin is fair, the expected counts are

E_H = E_T = 80 \cdot \tfrac{1}{2} = 40.

The test statistic is

\begin{align*} \chi^2 &= \dfrac{(50 - 40)^2}{40} + \dfrac{(30 - 40)^2}{40} \\[3pt] &= \dfrac{100}{40} + \dfrac{100}{40} \\[3pt] &= 5. \end{align*}
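The arithmetic in this coin example can be checked with a few lines of Python. The variable names below are illustrative; the counts and the fair-coin probability of $\tfrac{1}{2}$ come from the example itself.

```python
observed = [50, 30]              # heads, tails
n = sum(observed)                # 80 tosses in total
expected = [n * 0.5, n * 0.5]    # fair coin: 40 heads, 40 tails

# Sum of (O_i - E_i)^2 / E_i over both categories
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi_sq)  # 5.0
```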