Heteroskedasticity
Heteroskedasticity means the error variance $\mathrm{Var}(u \mid x)$ changes with the regressors rather than equaling a constant $\sigma^2$. It violates the homoskedasticity assumption (MLR.5), the last piece of Gauss-Markov. The consequences are sharply bounded: OLS coefficients stay unbiased and consistent, and $R^2$ is unaffected, because none of those properties uses MLR.5. What breaks is inference: the usual standard error formula is wrong, so the reported $t$ statistics, $F$ statistics, and confidence intervals are invalid. This node is the hub of the violation, test, remedy chain for the variance assumption.
Try it yourself
The error spread grows with x (heteroskedasticity), yet the OLS slope stays unbiased. What breaks is the standard error: the classical SE is invalid here, while the robust (HC1) SE, the one Stata reports under the `, robust` option, is asymptotically valid. The point estimate is fixed by construction, so only the SEs move.
The robust SE comes from `regress y x, robust`. It is asymptotically valid, not exactly correct, and need not be larger than the classical SE in general; it simply happens to be larger in this rising-variance design.
Why it matters
Picture a scatter of household savings against income. At low incomes the points hug the regression line; at high incomes they fan out, because richer households have far more discretion over how much to save. The line through the cloud is still in the right place on average, so the slope estimate is fine. The problem is that OLS, told the spread is the same everywhere, miscounts how precise the slope is. It leans too hard on the noisy high-income points and reports a standard error that no longer matches reality, so every $t$ and $F$ test built on it is untrustworthy.
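The fan-shaped picture can be checked with a small Monte Carlo. The sketch below is illustrative only: the data-generating process (error standard deviation equal to `0.1 * x**2`), the sample size, and the function names `ols_simple` and `simulate` are assumptions, not part of the original page. It compares the average classical SE and the average HC1 SE with the actual sampling spread of the slope across replications.

```python
import math
import random


def ols_simple(x, y):
    """Simple OLS of y on x with an intercept.

    Returns (slope, classical SE, HC1 robust SE) for the slope.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical formula: assumes one common error variance sigma^2.
    sigma2_hat = sum(u * u for u in resid) / (n - 2)
    se_classical = math.sqrt(sigma2_hat / sxx)
    # HC1: White's estimator with an n / (n - k) correction, k = 2 here.
    meat = sum((d * u) ** 2 for d, u in zip(dx, resid))
    se_hc1 = math.sqrt(n / (n - 2) * meat / sxx ** 2)
    return slope, se_classical, se_hc1


def simulate(reps=2000, n=200, true_slope=0.5, seed=0):
    """Monte Carlo under an assumed design where sd(u | x) = 0.1 * x^2."""
    rng = random.Random(seed)
    slopes, se_c, se_r = [], [], []
    for _ in range(reps):
        x = [rng.uniform(1.0, 10.0) for _ in range(n)]
        # The error spread fans out as x grows (heteroskedasticity).
        y = [1.0 + true_slope * xi + rng.gauss(0.0, 0.1 * xi ** 2) for xi in x]
        b, c, r = ols_simple(x, y)
        slopes.append(b)
        se_c.append(c)
        se_r.append(r)
    mean_slope = sum(slopes) / reps
    # Sampling SD of the slope across replications: what a valid SE should track.
    sd_slope = math.sqrt(sum((b - mean_slope) ** 2 for b in slopes) / (reps - 1))
    return mean_slope, sd_slope, sum(se_c) / reps, sum(se_r) / reps


mean_slope, sd_slope, avg_classical, avg_hc1 = simulate()
print(f"mean slope estimate  : {mean_slope:.4f} (true value 0.5)")
print(f"Monte Carlo SD       : {sd_slope:.4f}")
print(f"average classical SE : {avg_classical:.4f}")
print(f"average HC1 SE       : {avg_hc1:.4f}")
```

In this assumed design the mean slope should sit close to the true value, the classical SE should understate the Monte Carlo SD, and the HC1 SE should land near it. The direction of the classical SE's error is specific to the design; under other variance patterns it can overstate the spread instead.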
Formulas
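A sketch of the key expressions in standard notation, assuming the simple regression $y_i = \beta_0 + \beta_1 x_i + u_i$ (the model and the symbols $\sigma_i^2$, $SST_x$, and $\hat u_i$ are assumptions chosen to match common textbook usage):

```latex
\begin{align}
\text{Homoskedasticity (MLR.5):}\quad
  & \mathrm{Var}(u_i \mid x_i) = \sigma^2 \\
\text{Heteroskedasticity:}\quad
  & \mathrm{Var}(u_i \mid x_i) = \sigma_i^2 \\
\text{True variance of the slope:}\quad
  & \mathrm{Var}(\hat\beta_1 \mid x)
    = \frac{\sum_{i=1}^{n} (x_i - \bar x)^2 \, \sigma_i^2}{SST_x^2},
    \qquad SST_x = \sum_{i=1}^{n} (x_i - \bar x)^2 \\
\text{Usual formula (valid only if } \sigma_i^2 = \sigma^2 \text{):}\quad
  & \mathrm{Var}(\hat\beta_1 \mid x) = \frac{\sigma^2}{SST_x} \\
\text{Heteroskedasticity-robust estimator:}\quad
  & \widehat{\mathrm{Var}}(\hat\beta_1)
    = \frac{\sum_{i=1}^{n} (x_i - \bar x)^2 \, \hat u_i^2}{SST_x^2}
\end{align}
```

The third line collapses to the fourth only when every $\sigma_i^2$ equals the same $\sigma^2$, which is why the usual standard error is wrong otherwise. The HC1 variant multiplies the last line by $n/(n-k)$, with $k = 2$ estimated parameters in this model.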
Worked examples
You regress household saving on income, family size, and age in Stata and want a first read on whether the error variance is constant before trusting the t statistics.
Run `regress saving inc size age`, then plot the residuals against fitted values with `rvfplot, yline(0)`. A funnel that widens with fitted saving is the visual signature of heteroskedasticity. Because OLS is still unbiased, the point estimates are usable, but you should not report the default standard errors until you have either tested for heteroskedasticity or switched to robust SEs.
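A numeric companion to the plot is to regress the squared residuals on the regressor and compute $n \cdot R^2$, the LM form of a Breusch-Pagan-type test. The sketch below uses a single regressor and simulated, hypothetical data (the names `inc`, `saving_het`, `saving_hom` and all numeric settings are assumptions for illustration, not the example's real data set).

```python
import random


def fit_line(x, y):
    """Return (intercept, slope, R-squared) from simple OLS of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return intercept, slope, 1.0 - ssr / sst


def breusch_pagan_lm(x, y):
    """LM statistic n * R^2 from regressing squared OLS residuals on x."""
    intercept, slope, _ = fit_line(x, y)
    u2 = [(yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)]
    _, _, r2_aux = fit_line(x, u2)
    return len(x) * r2_aux


CHI2_1_CRIT_5PCT = 3.841  # 5% critical value, chi-square with 1 df

rng = random.Random(1)
inc = [rng.uniform(1.0, 10.0) for _ in range(500)]
# Hypothetical data: noise sd rising with income versus constant noise sd.
saving_het = [0.2 + 0.1 * v + rng.gauss(0.0, 0.3 * v) for v in inc]
saving_hom = [0.2 + 0.1 * v + rng.gauss(0.0, 1.5) for v in inc]

lm_het = breusch_pagan_lm(inc, saving_het)
lm_hom = breusch_pagan_lm(inc, saving_hom)
print(f"LM, rising-variance data  : {lm_het:.2f}")
print(f"LM, constant-variance data: {lm_hom:.2f}")
print(f"reject at 5% if LM > {CHI2_1_CRIT_5PCT}")
```

With rising variance the statistic should far exceed the critical value. With constant variance it usually stays below it, though it will cross the threshold in about 5% of samples by chance.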
Common mistakes
- ✗ Heteroskedasticity biases the OLS coefficients. It does not. Unbiasedness (MLR.1 to MLR.4) and consistency never use the constant-variance assumption, so $\hat\beta_j$ is still unbiased and consistent. Only the standard errors, and the $t$ statistics, $F$ statistics, and confidence intervals built from them, are wrong.
- ✗ Heteroskedasticity lowers the $R^2$ or distorts goodness of fit. The population $R^2$ and its sample estimate depend on the conditional mean and the variance of $y$, not on whether $\mathrm{Var}(u \mid x)$ is constant, so model fit is unaffected.
- ✗ A big spread in the residuals always means heteroskedasticity. Large but constant scatter is homoskedastic. Heteroskedasticity is specifically a spread that changes with the regressors, which is why you test for it by relating the squared residuals to the explanatory variables.
- ✗ If errors are heteroskedastic, OLS is useless and you must abandon it. Modern practice keeps OLS and simply replaces the standard errors with robust ones, because the estimator itself is still unbiased and consistent.
Revision bullets
- Definition: $\mathrm{Var}(u \mid x)$ depends on the regressors, violating MLR.5
- OLS stays unbiased and consistent, and $R^2$ is unaffected
- Gauss-Markov fails, so OLS is no longer BLUE (no longer minimum variance among linear unbiased estimators)
- The usual SE formula is wrong, so $t$ statistics, $F$ statistics, and CIs are invalid
- Fix the inference, not the estimator: robust SEs, or WLS and FGLS
Quick check
Under heteroskedasticity, which property of OLS is lost?
A funnel-shaped plot of residuals against fitted values most directly suggests:
Why is heteroskedasticity described as breaking the Gauss-Markov theorem?
Sources
- Wooldridge (2019), Ch. 8. Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage, 2019. ISBN 978-1-337-55886-0. Chapter 8 opens by showing that heteroskedasticity leaves OLS unbiased and consistent while invalidating the usual variance estimator and the tests built on it.
- Wooldridge (2019), §8.1. Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage, 2019. Section 8.1, Consequences of Heteroskedasticity for OLS. States the precise consequences: estimators stay unbiased and consistent, but standard errors and test statistics are no longer valid.