Confidence Intervals for Two Means: Unequal and Unknown Population Variance

Construct a Welch t-confidence interval for the difference between two population means when the population variances are unknown and unequal. Covers the conservative degrees of freedom choice, the Welch-Satterthwaite degrees of freedom formula, and full applications in context.


Tutorial

Introduction

When the population variances $\sigma_1^2$ and $\sigma_2^2$ are unknown, we replace them with the sample variances $s_1^2$ and $s_2^2$ and use a $t$-critical value rather than a $z$-critical value. The resulting interval is called the Welch $t$-interval.

Given independent samples from two populations, a $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t^*_{\alpha/2,\, df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}.$$

When the variances are unknown and unequal, the standardized difference of sample means underlying this interval does not follow a $t$-distribution exactly. A simple, conservative choice of degrees of freedom is

$$df = \min(n_1 - 1,\, n_2 - 1).$$

This choice gives a wider interval than strictly necessary but requires no extra computation. For example, with $n_1 = 8$ and $n_2 = 12$, we would use $df = \min(7, 11) = 7$.
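The calculation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `conservative_df` and `welch_interval` are made up for this example, the summary statistics (means of 20 and 17, variances of 16 and 9) are hypothetical, and the critical value is passed in by hand as it would be read from a $t$-table. For a 95% interval with $df = 7$, that table value is $t^* \approx 2.365$.

```python
from math import sqrt


def conservative_df(n1: int, n2: int) -> int:
    """Conservative degrees of freedom: the smaller of n1 - 1 and n2 - 1."""
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")
    return min(n1 - 1, n2 - 1)


def welch_interval(mean1, mean2, var1, var2, n1, n2, t_star):
    """Return (lower, upper) for mu1 - mu2, given a t critical value.

    var1 and var2 are the sample variances s1^2 and s2^2.
    t_star is supplied by the caller (for example, from a t-table).
    """
    diff = mean1 - mean2
    standard_error = sqrt(var1 / n1 + var2 / n2)
    margin = t_star * standard_error
    return diff - margin, diff + margin


# Sample sizes from the example above; all other numbers are hypothetical.
df = conservative_df(8, 12)
lower, upper = welch_interval(
    mean1=20.0, mean2=17.0,
    var1=16.0, var2=9.0,
    n1=8, n2=12,
    t_star=2.365,  # t-table value for a 95% interval with df = 7
)
print(df, round(lower, 2), round(upper, 2))
```

With these made-up inputs the standard error is $\sqrt{16/8 + 9/12} = \sqrt{2.75} \approx 1.658$, so the interval is roughly $3 \pm 3.92$. Passing the critical value as an argument keeps the sketch free of external libraries; a statistics package could supply it instead.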
