Introduction to Chi-Square Goodness-of-Fit

Introduces the chi-square goodness-of-fit test for assessing whether observed categorical counts are consistent with a hypothesized probability distribution. Covers the test statistic, degrees of freedom, and the critical-value decision rule for both uniform and non-uniform null distributions.


Tutorial

The Chi-Square Goodness-of-Fit Statistic

We often want to test whether observed counts in $k$ categories are consistent with a hypothesized probability distribution. The chi-square goodness-of-fit test compares observed counts $O_1, O_2, \ldots, O_k$ to expected counts $E_1, E_2, \ldots, E_k$ predicted under the hypothesized distribution.

If the hypothesized distribution assigns probability $p_i$ to category $i$ and the sample has total size $n,$ then the expected count for category $i$ is

$$E_i = n\, p_i.$$

The chi-square goodness-of-fit statistic is

$$\chi^2 = \sum_{i=1}^{k} \dfrac{(O_i - E_i)^2}{E_i}.$$

Each term measures the squared deviation of an observed count from its expected count, scaled by that expected count. A large value of $\chi^2$ indicates poor agreement between the data and the hypothesized distribution.
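The two formulas above translate almost directly into code. The following is a minimal sketch in plain Python, not a reference implementation: the function names `expected_counts` and `chi_square_statistic` are our own, and the three-category data are made up purely for illustration.

```python
import math


def expected_counts(n, probs):
    """Expected count E_i = n * p_i for each category."""
    return [n * p for p in probs]


def chi_square_statistic(observed, probs):
    """Sum of (O_i - E_i)^2 / E_i over all categories."""
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("hypothesized probabilities must sum to 1")
    if any(p <= 0 for p in probs):
        raise ValueError("every category needs a positive hypothesized probability")

    n = sum(observed)  # total sample size
    expected = expected_counts(n, probs)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data (not from the lesson): 60 observations in 3 categories.
observed = [18, 22, 20]
probs = [0.25, 0.5, 0.25]

stat = chi_square_statistic(observed, probs)
print(round(stat, 4))
```

Here the expected counts are 15, 30, and 15, so the statistic is approximately 4.4. The checks on `probs` are a design choice of this sketch: a zero probability would make the corresponding $E_i$ zero and the term undefined.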

To illustrate, suppose we toss a coin $80$ times and observe $50$ heads and $30$ tails. Under the hypothesis that the coin is fair, the expected counts are

$$E_H = E_T = 80 \cdot \tfrac{1}{2} = 40.$$

The test statistic is

$$
\begin{align*}
\chi^2 &= \dfrac{(50 - 40)^2}{40} + \dfrac{(30 - 40)^2}{40} \\[3pt]
&= \dfrac{100}{40} + \dfrac{100}{40} \\[3pt]
&= 5.
\end{align*}
$$
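As a quick check of the arithmetic, the coin-toss calculation can be reproduced in a few lines of Python. This is only a sketch; the variable names (`observed`, `fair`, `terms`) are our own choices, while the counts are the ones from the example.

```python
# Coin-toss example from the text: 80 tosses, 50 heads and 30 tails.
observed = {"heads": 50, "tails": 30}
fair = {"heads": 0.5, "tails": 0.5}  # hypothesized probabilities for a fair coin

n = sum(observed.values())  # total number of tosses
expected = {side: n * p for side, p in fair.items()}

# Contribution of each category: (O - E)^2 / E
terms = {
    side: (observed[side] - expected[side]) ** 2 / expected[side]
    for side in observed
}
chi_square = sum(terms.values())

print(expected)    # {'heads': 40.0, 'tails': 40.0}
print(terms)       # {'heads': 2.5, 'tails': 2.5}
print(chi_square)  # 5.0
```

Each category contributes $100/40 = 2.5$ to the total, which matches the hand calculation.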