Fitted Values and Residuals
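For each observation $i$ the fitted value is $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ and the residual is $\hat{u}_i = y_i - \hat{y}_i$, the vertical gap between the actual point and the line. When the regression includes an intercept, OLS gives three algebraic properties that always hold by construction: the residuals sum to zero, $\sum_{i=1}^{n} \hat{u}_i = 0$; they are uncorrelated with the regressor in the sample, $\sum_{i=1}^{n} x_i \hat{u}_i = 0$; and the line passes through the sample means, so $(\bar{x}, \bar{y})$ lies on it. Crucially, the residual $\hat{u}_i$ is not the error $u_i$; it is the estimated, observable counterpart of the unobservable population error.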
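The three properties can be checked numerically. The following is a minimal Python sketch, assuming a small made-up data set (the `x` and `y` values and all variable names are illustrative, not taken from any real sample) and the textbook closed-form formulas for the simple-regression slope and intercept.

```python
# Minimal sketch: simple OLS by hand on made-up data (values are illustrative only).
x = [8, 10, 12, 12, 14, 16, 18]                 # hypothetical years of education
y = [9.5, 11.0, 14.2, 12.8, 17.1, 19.4, 24.0]   # hypothetical hourly wage

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS estimates for y = b0 + b1 * x
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]              # fitted values
u_hat = [yi - fi for yi, fi in zip(y, y_hat)]   # residuals

sum_resid = sum(u_hat)                                   # property 1
sum_x_resid = sum(xi * ui for xi, ui in zip(x, u_hat))   # property 2
gap_at_means = y_bar - (b0 + b1 * x_bar)                 # property 3

print(f"sum of residuals:          {sum_resid:.2e}")
print(f"sum of x * residual:       {sum_x_resid:.2e}")
print(f"line minus y_bar at x_bar: {gap_at_means:.2e}")
```

All three printed quantities should be zero up to floating-point rounding, whatever data are used, as long as `x` is not constant.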
Try it yourself
OLS picks the line that minimises the sum of squared residuals, $\mathrm{SSR} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$. Residuals are the vertical gaps from each point to the line. Any other line drawn through the same scatter has a larger SSR than the OLS line.
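One way to try this without a plot is to compute SSR for the OLS line and for a few hand-picked alternatives. This is a rough Python sketch under stated assumptions: the data, the helper names `ols_line` and `ssr`, and the candidate lines are all invented for illustration.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the simple OLS line for y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def ssr(intercept, slope, x, y):
    """Sum of squared vertical gaps between the points and a given line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_line(x, y)
ssr_ols = ssr(b0, b1, x, y)

# Hypothetical "hand-drawn" lines, some of them very close to the OLS line
candidates = [(0.0, 2.0), (0.5, 1.9), (b0 + 0.1, b1), (b0, b1 - 0.05)]
ssr_others = [ssr(c0, c1, x, y) for c0, c1 in candidates]

print(f"OLS line: intercept={b0:.3f}, slope={b1:.3f}, SSR={ssr_ols:.4f}")
for (c0, c1), s in zip(candidates, ssr_others):
    print(f"candidate intercept={c0:.3f}, slope={c1:.3f}, SSR={s:.4f}")
```

Every candidate should report a larger SSR than the OLS line, including those that differ from it only slightly.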
Why it matters
After fitting the line you can read off, for each person, what the line predicts ($\hat{y}_i$) and how far the truth sat above or below it ($\hat{u}_i$). Because of how OLS is built, those residuals balance out to zero and carry no leftover linear relationship with $x$. The error $u_i$, by contrast, is the true unobserved gap in the population that you never actually see. The residual is your best in-sample estimate of it, much as a sample mean estimates a population mean.
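A simulation makes the residual/error distinction concrete, because only in simulated data are the true errors known. The sketch below assumes invented population parameters (`beta0`, `beta1`), an arbitrary error standard deviation, and a fixed random seed; none of these come from the source.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Assumed "true" population parameters; with real data these are unknown.
beta0, beta1 = 1.0, 0.5
n = 50

x = [random.uniform(8, 18) for _ in range(n)]
u = [random.gauss(0, 2) for _ in range(n)]           # true errors (observable only in a simulation)
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

# OLS estimates from the simulated sample
x_bar, y_bar = sum(x) / n, sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

u_hat = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]  # residuals

mean_u = sum(u) / n          # sample mean of the true errors: generally not zero
mean_u_hat = sum(u_hat) / n  # sample mean of the residuals: zero by construction
max_gap = max(abs(ui - uhi) for ui, uhi in zip(u, u_hat))

print(f"estimates: b0={b0:.3f}, b1={b1:.3f} (true values {beta0}, {beta1})")
print(f"mean of true errors: {mean_u:.4f}")
print(f"mean of residuals:   {mean_u_hat:.2e}")
print(f"largest |u - u_hat|: {max_gap:.4f}")
```

The residuals average to zero mechanically, while the true errors in a finite sample generally do not, and each residual differs from its error because the estimates differ from the true parameters.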
Formulas
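The definitions and algebraic properties described in this section can be written compactly as follows. This is a sketch in the usual notation of the cited Wooldridge chapter, assuming the simple regression model $y_i = \beta_0 + \beta_1 x_i + u_i$ estimated by OLS with an intercept on a sample of size $n$.

```latex
\begin{align*}
\hat{y}_i &= \hat{\beta}_0 + \hat{\beta}_1 x_i
  && \text{(fitted value)} \\
\hat{u}_i &= y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i
  && \text{(residual, observable)} \\
u_i &= y_i - \beta_0 - \beta_1 x_i
  && \text{(error, unobservable)} \\
\mathrm{SSR} &= \sum_{i=1}^{n} \hat{u}_i^{2} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
  && \text{(minimised by OLS)} \\
\sum_{i=1}^{n} \hat{u}_i &= 0
  && \text{(residuals sum to zero)} \\
\sum_{i=1}^{n} x_i \hat{u}_i &= 0
  && \text{(residuals uncorrelated with } x \text{ in the sample)} \\
\bar{y} &= \hat{\beta}_0 + \hat{\beta}_1 \bar{x}
  && \text{(line passes through } (\bar{x}, \bar{y}) \text{)}
\end{align*}
```

The last line follows from dividing $\sum_{i=1}^{n} \hat{u}_i = 0$ by $n$ and substituting the definition of $\hat{u}_i$.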
Worked examples
After `regress wage educ`, a student types `predict wagehat` and `predict uhat, residuals` in Stata and then `summarize uhat`.
`wagehat` holds the fitted values $\hat{y}_i$ and `uhat` holds the residuals $\hat{u}_i$. The mean of `uhat` reported by `summarize` is zero (up to rounding), illustrating $\sum_{i=1}^{n} \hat{u}_i = 0$. A scatter of `uhat` against `educ` shows no linear trend, reflecting $\sum_{i=1}^{n} x_i \hat{u}_i = 0$. These residuals are estimates of the unobserved errors, not the errors themselves.
Common mistakes
- ✗ The residual $\hat{u}_i$ equals the error term $u_i$. The error $u_i = y_i - \beta_0 - \beta_1 x_i$ uses the true unknown parameters and is unobservable. The residual $\hat{u}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$ uses the estimates and is observable. They coincide only if the estimates equal the true parameters, which essentially never happens.
- ✗ Residuals summing to zero means the model fits well. $\sum_{i=1}^{n} \hat{u}_i = 0$ holds for any OLS fit with an intercept, even a terrible one. It is a mechanical property, not evidence of good fit.
- ✗ Zero correlation between residuals and $x$ confirms the zero conditional mean assumption. $\sum_{i=1}^{n} x_i \hat{u}_i = 0$ is built into OLS by construction and tells you nothing about whether $E(u \mid x) = 0$ holds in the population.
- ✗ The regression line need not pass through the average point. With an intercept, OLS always makes the line go through $(\bar{x}, \bar{y})$. This follows directly from $\sum_{i=1}^{n} \hat{u}_i = 0$.
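The second and third mistakes can be illustrated with a deliberately misspecified fit. In this hedged Python sketch the data are constructed so that $y = x^2$ exactly (an assumption made purely for illustration), and a straight line is fitted anyway. The helper name `ols_fit` is hypothetical.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the simple OLS line for y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Constructed data: the true relationship is quadratic, not linear
x = [0, 1, 2, 3, 4, 5, 6]
y = [xi ** 2 for xi in x]

b0, b1 = ols_fit(x, y)
u_hat = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sum_resid = sum(u_hat)                                        # zero by construction
sum_x_resid = sum(xi * ui for xi, ui in zip(x, u_hat))        # zero by construction
sum_x2_resid = sum(xi ** 2 * ui for xi, ui in zip(x, u_hat))  # not forced to zero

print("residuals:", u_hat)
print("sum of residuals:     ", sum_resid)
print("sum of x * residual:  ", sum_x_resid)
print("sum of x^2 * residual:", sum_x2_resid)
```

The residuals are positive at both ends of the range and negative in the middle, so the linear model's error clearly depends on $x$, yet both mechanical conditions hold exactly. Only the sum involving $x^2$, which OLS does not force to zero, reveals the problem.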
Revision bullets
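- Fitted value $\hat{y}_i$ lies on the line
- Residual $\hat{u}_i$ is the vertical miss
- $\sum_{i=1}^{n} \hat{u}_i = 0$ and $\sum_{i=1}^{n} x_i \hat{u}_i = 0$ by construction
- The fitted line always passes through $(\bar{x}, \bar{y})$
- Residual $\hat{u}_i$ estimates, but is not equal to, the error $u_i$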
Quick check
What is the key difference between the residual $\hat{u}_i$ and the error $u_i$?
In an OLS regression with an intercept, which statement is always true by construction?
Sources
- Wooldridge (2019), Ch. 2.3. Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage Learning, 2019. ISBN 978-1-337-55886-0. Section 2.3 defines fitted values and residuals and lists the algebraic properties of OLS, including the two summation conditions and passage through the sample means.
- Wooldridge (2019), §2.5 (errors vs residuals). Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage Learning, 2019. Clarifies the distinction between the unobservable error and the computed residual.