Residuals and Residual Plots

Define a residual as the vertical gap between an observed data value and the value predicted by a regression line, compute residuals from a regression equation and data, and use residual plots to judge whether a linear model is appropriate.

Step 1 of 128%

Tutorial

Defining the Residual

A regression line gives a predicted value y^\hat{y} for each value of xx. The actual observed value yy usually differs from y^\hat{y}, and that difference is called the residual.

The residual for a data point (xi,yi)(x_i, y_i) is the observed value minus the predicted value:

ei=yiy^ie_i = y_i - \hat{y}_i

Geometrically, ei|e_i| is the vertical distance from the point to the regression line.

For example, suppose the regression line is y^=3x1\hat{y} = 3x - 1 and one of the observations is (2,7)(2, 7). The predicted value at x=2x=2 is

y^=3(2)1=5,\hat{y} = 3(2) - 1 = 5,

so the residual is

e=75=2.e = 7 - 5 = 2.

The observation sits 22 units above the regression line.

navigate · Enter open · Esc close · ⌘K/Ctrl K toggle