Heteroskedasticity-Robust Standard Errors
Heteroskedasticity-robust standard errors (White, also called Huber-Eicker or sandwich SEs) replace the single error variance $\sigma^2$ in the variance formula with each observation’s squared residual $\hat{u}_i^2$. The result is valid whatever the form of heteroskedasticity, and it does not require you to know or model the variance function. Because OLS coefficients are already unbiased and consistent under heteroskedasticity, swapping in robust SEs repairs the one thing that was broken, the inference, while keeping the same estimates. In Stata this is one option, `regress y x, robust`. Robust SEs are the modern default, with the caveat that their justification is large-sample.
Try it yourself
The error spread grows with x (heteroskedasticity), yet the OLS slope stays unbiased. What breaks is the standard error: the classical SE is invalid here, while the robust (HC1) SE (Stata’s `, robust`) is asymptotically valid. The point estimate is fixed by construction, so only the SEs move.
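The design described above can be reproduced with a short simulation. What follows is a minimal sketch in plain Python, not the page's own demo: the data-generating process (`y = 1 + 2x + u` with `sd(u | x) = x`), the sample size, the seed, and the function names `slope_and_ses` and `simulate` are all assumptions made for illustration. It computes the classical SE and the HC1 robust SE for the slope of a one-regressor model by hand.

```python
import math
import random


def slope_and_ses(x, y):
    """OLS slope of y on x (with intercept), its classical SE, and its HC1 robust SE."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sst_x = sum(d * d for d in dx)  # total variation in x
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sst_x
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classical: one common variance estimate, SSR / (n - 2), for every observation.
    sigma2_hat = sum(u * u for u in resid) / (n - 2)
    se_classical = math.sqrt(sigma2_hat / sst_x)

    # Robust (HC0): each observation contributes its own squared residual.
    var_hc0 = sum(d * d * u * u for d, u in zip(dx, resid)) / sst_x ** 2
    # HC1 rescales HC0 by n / (n - k), with k = 2 coefficients here.
    se_hc1 = math.sqrt(var_hc0 * n / (n - 2))
    return slope, se_classical, se_hc1


def simulate(n=20000, seed=0):
    """Assumed design: y = 1 + 2x + u, where the spread of u grows with x."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    y = [1.0 + 2.0 * xi + xi * rng.gauss(0.0, 1.0) for xi in x]
    return slope_and_ses(x, y)


slope, se_classical, se_hc1 = simulate()
print(f"slope={slope:.3f}  classical SE={se_classical:.4f}  HC1 SE={se_hc1:.4f}")
```

The slope is computed once and shared by both standard errors, so only the SEs can differ. In this assumed design the error variance is largest where x is far from its mean, which is why the HC1 SE should come out above the classical one; a design with the noise concentrated near the mean of x would push it the other way.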
Robust SE: the value reported by `regress y x, robust`. It is asymptotically valid, not exactly correct, and need not be larger than the classical SE in general; it simply happens to be larger in this rising-variance design.
Why it matters
The default standard error trusts a promise the data may not keep, that every observation is equally noisy. Robust standard errors drop that promise and instead let each point speak for its own noise, using how far it actually missed the line, the squared residual $\hat{u}_i^2$, as the local measure of variance. Points in the noisy region get more weight in the uncertainty calculation; points in the tight region get less. You change nothing about the fitted line. You only let the standard errors tell the truth about how precise that line really is.
Formulas
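The standard textbook forms can be written out as follows. This is a sketch in Wooldridge-style notation, and the symbols are assumptions made here: $\hat{u}_i$ is the OLS residual, $\sigma_i^2 = \operatorname{Var}(u_i \mid x_i)$, $\mathrm{SST}_x = \sum_i (x_i - \bar{x})^2$, $x_i$ in the matrix form is the column vector of regressors for observation $i$ (including the constant), and $k$ is the number of estimated coefficients including the intercept.

```latex
\begin{align*}
% Simple regression y_i = \beta_0 + \beta_1 x_i + u_i: true sampling variance of the slope
\operatorname{Var}(\hat{\beta}_1 \mid x)
  &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sigma_i^2}{\mathrm{SST}_x^2} \\[4pt]
% Classical estimator: assumes \sigma_i^2 = \sigma^2 for every i
\widehat{\operatorname{Var}}_{\text{classical}}(\hat{\beta}_1)
  &= \frac{\hat{\sigma}^2}{\mathrm{SST}_x},
  \qquad \hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2 \\[4pt]
% Robust estimator: replace each \sigma_i^2 with the squared residual
\widehat{\operatorname{Var}}_{\text{robust}}(\hat{\beta}_1)
  &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \hat{u}_i^2}{\mathrm{SST}_x^2} \\[4pt]
% General (sandwich) form for the full coefficient vector
\widehat{\operatorname{Var}}_{\text{HC0}}(\hat{\beta})
  &= (X'X)^{-1} \Big( \sum_{i=1}^{n} \hat{u}_i^2 \, x_i x_i' \Big) (X'X)^{-1} \\[4pt]
% HC1: degrees-of-freedom rescaling
\widehat{\operatorname{Var}}_{\text{HC1}}(\hat{\beta})
  &= \frac{n}{n-k} \, \widehat{\operatorname{Var}}_{\text{HC0}}(\hat{\beta})
\end{align*}
```

Setting every $\hat{u}_i^2$ equal to a common $\hat{\sigma}^2$ in the robust simple-regression formula collapses it to the classical one, which shows that the two differ only in how the error variance is treated.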
Worked examples
Your wage regression shows a clear funnel in the residual plot and you want valid inference without committing to a model for the error variance.
Estimate `regress lwage educ exper tenure, robust` (or the equivalent `vce(robust)`). The coefficients are identical to plain OLS; only the standard error column, and therefore the $t$ statistics and the 95% confidence intervals, change. Report the robust SEs. If a coefficient is significant under robust SEs, that conclusion does not depend on assuming constant variance.
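To see how a changed SE carries through to the $t$ statistic and the confidence interval, the arithmetic can be sketched as below. The coefficient and both standard errors are hypothetical numbers chosen for illustration, not estimates from any wage dataset, and the helper name `t_and_ci` is an assumption. The interval uses the normal critical value, a large-sample simplification; Stata's output is based on the $t$ distribution, so its intervals would be slightly wider in small samples.

```python
from statistics import NormalDist


def t_and_ci(coef, se, level=0.95):
    """Large-sample t statistic and confidence interval from a coefficient and its SE."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level = 0.95
    return coef / se, (coef - z * se, coef + z * se)


# Hypothetical values for illustration only.
coef_educ = 0.090      # same coefficient under both SE choices
se_classical = 0.0070  # hypothetical classical SE
se_robust = 0.0080     # hypothetical robust SE

for label, se in [("classical", se_classical), ("robust", se_robust)]:
    t_stat, (lower, upper) = t_and_ci(coef_educ, se)
    print(f"{label:9s} t = {t_stat:6.2f}   95% CI = [{lower:.4f}, {upper:.4f}]")
```

The coefficient enters both rows unchanged; only the denominator of the $t$ statistic and the half-width of the interval move with the SE.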
Common mistakes
- ✗ Robust standard errors change the coefficient estimates. They do not. OLS produces the same $\hat{\beta}$; the `robust` option only recomputes the standard errors and everything derived from them.
- ✗ Robust standard errors fix bias caused by an omitted variable or a wrong functional form. They address only heteroskedasticity in the SEs. If the conditional mean is misspecified, the coefficients are biased and robust SEs cannot rescue them.
- ✗ Robust standard errors are always larger than the usual ones. They can be larger or smaller, because they are a reweighting of the residual information, not an inflation factor.
- ✗ Robust standard errors are exact in any sample size. Their validity is a large-sample (asymptotic) result, so in very small samples they can be unreliable and a correction or caution is warranted.
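The second mistake in the list above can be illustrated with a small simulation. This is a sketch under an assumed design, with hypothetical function names `ols_slope` and `short_regression_slope`: the true model is `y = 1 + 2x + 3w + e` with `w = 0.5x + v`, but `w` is omitted from the regression. The slope on `x` then settles near 2 + 3 × 0.5 = 3.5 rather than 2. Because robust SEs leave the point estimate untouched, no choice of standard error can move it back.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def short_regression_slope(n=20000, seed=1):
    """Assumed design: y = 1 + 2x + 3w + e with w = 0.5x + v; w is left out of the fit."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    w = [0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]
    y = [1.0 + 2.0 * xi + 3.0 * wi + rng.gauss(0.0, 1.0) for xi, wi in zip(x, w)]
    return ols_slope(x, y)  # regress y on x only, omitting w


slope = short_regression_slope()
print(f"slope with w omitted: {slope:.3f} (true effect of x is 2)")
```

The bias here comes from the misspecified conditional mean, not from the error variance, so it persists however large the sample and whichever SE is reported.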
Revision bullets
- Replace $\sigma^2$ with $\hat{u}_i^2$ in the variance formula (sandwich form)
- Valid for any form of heteroskedasticity; no variance model needed
- Coefficients unchanged; only SEs, $t$ statistics, $p$-values, and CIs are recomputed
- Stata: `regress y x, robust` or `vce(robust)`
- Justification is large-sample, so prefer it with a decent sample size $n$
Quick check
Compared with default OLS, adding the `robust` option in Stata changes:
A key advantage of heteroskedasticity-robust standard errors is that they:
Why are robust standard errors often called the modern default?
Sources
- Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach, 7th ed., Section 8.2, Heteroskedasticity-Robust Inference after OLS Estimation. Cengage, 2019. Derives the White heteroskedasticity-robust standard errors and the robust t and F statistics, and notes their large-sample justification.
- White, Halbert. A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity. Econometrica 48(4): 817-838, 1980. Original derivation of the heteroskedasticity-consistent (sandwich) covariance matrix estimator that underlies Stata’s robust option.