Pooled OLS on Panel Data
Pooled OLS is the naive baseline for panel data: stack all observations and run ordinary OLS as if the panel structure did not exist. Start from the cluster spine $y_{it} = \beta_0 + \beta_1 x_{it} + a_i + u_{it}$, where $a_i$ is an unobserved, time-constant entity effect. Pooled OLS folds that effect into a composite error $v_{it} = a_i + u_{it}$ and estimates $y_{it} = \beta_0 + \beta_1 x_{it} + v_{it}$. This is fine only when $a_i$ is uncorrelated with the regressors. The moment $\mathrm{Cov}(x_{it}, a_i) \neq 0$, the estimator is biased and inconsistent for $\beta_1$, because the omitted variable now sits inside the error and is correlated with $x_{it}$.
Why it matters
Picture one regression line drawn through three firms pooled together. That single line ignores who each firm is. If high-quality firms (a large $a_i$) also tend to carry more leverage, the pooled slope on leverage absorbs the firm-quality difference and no longer measures the within-firm effect. There is a second, separate problem even when the slope happens to be consistent: because $a_i$ is the same in every period for a given firm, the composite errors $v_{it} = a_i + u_{it}$ are serially correlated within an entity, so the default OLS standard errors are wrong and typically overstate precision. Fixed effects fixes the bias by differencing $a_i$ away; clustered standard errors fix the inference.
Formulas
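The relationships described in this section can be written out as a short sketch. The notation ($a_i$, $u_{it}$, $v_{it}$, $\sigma_a^2$, $\sigma_u^2$) is an assumption in the style of the Wooldridge chapters listed under Sources, not necessarily the page's original symbols. The probability limit is for the single-regressor case and assumes $u_{it}$ is uncorrelated with $x_{it}$ and that moments are constant over time; the covariance lines additionally assume $u_{it}$ is serially uncorrelated, has constant variance, and is uncorrelated with $a_i$.

```latex
\begin{align*}
y_{it} &= \beta_0 + \beta_1 x_{it} + a_i + u_{it},
  \qquad i = 1,\dots,N,\quad t = 1,\dots,T \\
v_{it} &= a_i + u_{it} \\
\operatorname{plim}\,\hat{\beta}_{1,\text{pooled}}
  &= \beta_1 + \frac{\operatorname{Cov}(x_{it}, a_i)}{\operatorname{Var}(x_{it})} \\
\operatorname{Cov}(v_{it}, v_{is}) &= \sigma_a^2 \quad (t \neq s) \\
\operatorname{Corr}(v_{it}, v_{is}) &= \frac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2} \quad (t \neq s)
\end{align*}
```

The third line is the usual omitted-variable bias term: it vanishes exactly when $\operatorname{Cov}(x_{it}, a_i) = 0$. The last two lines show that the within-entity correlation of the composite error does not fade with the distance between periods, because it comes entirely from the shared $a_i$.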
Worked examples
Estimate how leverage relates to firm profitability across a panel of firms in Stata.
On the stacked firm-year data, pooled OLS is simply `regress roa lev size`. Read the slope on `lev` as the cross-firm association in pooled data. The estimate is suspect if unobserved firm quality (management, governance, niche) is correlated with leverage, since that quality lands in the error and biases the coefficient. There is no panel option here on purpose: `regress` ignores the firm identifier entirely.
Show why the same data needs a fixed-effects estimator instead.
Compare pooled OLS with within estimation: run `regress roa lev size`, then (after declaring the panel with `xtset`) `xtreg roa lev size, fe`. If the `lev` coefficient shifts materially, the gap is evidence that $a_i$ was correlated with leverage and that pooled OLS was biased. The `fe` estimator removes $a_i$ by demeaning each firm, so it identifies the within-firm effect that pooled OLS could not.
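The pooled-versus-within comparison can also be sketched on simulated data, where the true slope is known. The following is a minimal illustration in plain Python, not the Stata workflow above: the data-generating process, the parameter `gamma` (how strongly the regressor loads on the firm effect), and all function names are assumptions made for the example.

```python
import random


def simulate_panel(n_firms=500, n_periods=5, beta=1.0, gamma=0.8, seed=0):
    """Return (firm, x, y) rows in which x is correlated with the firm effect."""
    rng = random.Random(seed)
    rows = []
    for firm in range(n_firms):
        a_i = rng.gauss(0.0, 1.0)  # unobserved, time-constant firm effect
        for _ in range(n_periods):
            x = gamma * a_i + rng.gauss(0.0, 1.0)     # regressor loads on a_i
            y = beta * x + a_i + rng.gauss(0.0, 1.0)  # true slope is beta
            rows.append((firm, x, y))
    return rows


def ols_slope(xs, ys):
    """Slope from a simple regression of ys on xs with an intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def pooled_slope(rows):
    """Pooled OLS: ignore the firm identifier entirely."""
    return ols_slope([r[1] for r in rows], [r[2] for r in rows])


def within_slope(rows):
    """Within (fixed-effects) slope: demean x and y by firm, then run OLS."""
    totals = {}
    for firm, x, y in rows:
        t = totals.setdefault(firm, [0.0, 0.0, 0])
        t[0] += x
        t[1] += y
        t[2] += 1
    x_dm = [x - totals[f][0] / totals[f][2] for f, x, _ in rows]
    y_dm = [y - totals[f][1] / totals[f][2] for f, _, y in rows]
    return ols_slope(x_dm, y_dm)


rows = simulate_panel()
b_pooled = pooled_slope(rows)
b_within = within_slope(rows)
print(f"pooled slope: {b_pooled:.3f}")
print(f"within slope: {b_within:.3f}")
```

Under these assumptions the pooled slope should settle well above the true value of 1, near $1 + \gamma/(\gamma^2 + 1)$, while the within slope should stay close to 1, because demeaning removes $a_i$ before the regression is run. Setting `gamma=0.0` removes the correlation, and the two slopes should then roughly agree.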
Common mistakes
- ✗ Believing pooled OLS is always unbiased because OLS is unbiased in a single cross section. With panel data the relevant error is $v_{it} = a_i + u_{it}$, and $a_i$ correlated with $x_{it}$ makes the estimator biased and inconsistent.
- ✗ Thinking the only problem with pooled OLS is wrong standard errors. The deeper issue is bias in $\hat{\beta}_1$ from the omitted $a_i$. Serial correlation in $v_{it}$ is a second, separate problem about inference, not about the point estimate.
- ✗ Assuming default `regress` standard errors are valid because the sample is large. Even with consistent slopes, $a_i$ persists across periods, so $v_{it}$ is serially correlated within an entity and the default SEs are wrong without clustering.
- ✗ Treating pooled OLS and fixed effects as interchangeable when $T$ is small. They estimate the same $\beta_1$ only when $\mathrm{Cov}(x_{it}, a_i) = 0$. When that fails, only the within or first-difference estimators remove the bias.
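The inference problem in the third mistake can be illustrated separately from the bias problem. The sketch below simulates a panel in which the firm effect is independent of the regressor, so the pooled slope is consistent, and then compares the default OLS standard error with a cluster-robust one. Everything here is an assumption for illustration: the data-generating process, the function names, and the plain sandwich formula, which omits the finite-sample adjustment that statistical packages typically apply (in Stata the analogous request would be `regress roa lev size, vce(cluster firm)`, with `firm` a hypothetical identifier name).

```python
import math
import random


def simulate(n_firms=500, n_periods=5, beta=1.0, seed=1):
    """Panel where the firm effect is independent of x, so pooled OLS is consistent."""
    rng = random.Random(seed)
    rows = []
    for firm in range(n_firms):
        a_i = rng.gauss(0.0, 1.0)  # firm effect, sits in the composite error
        f_i = rng.gauss(0.0, 1.0)  # persistent firm-level part of the regressor
        for _ in range(n_periods):
            x = f_i + rng.gauss(0.0, 1.0)
            y = beta * x + a_i + rng.gauss(0.0, 1.0)
            rows.append((firm, x, y))
    return rows


def slope_and_ses(rows):
    """Pooled OLS slope with default and cluster-robust (by firm) standard errors."""
    n = len(rows)
    mean_x = sum(r[1] for r in rows) / n
    mean_y = sum(r[2] for r in rows) / n
    sxx = sum((r[1] - mean_x) ** 2 for r in rows)
    b = sum((r[1] - mean_x) * (r[2] - mean_y) for r in rows) / sxx

    sse = 0.0
    firm_scores = {}  # sum of (x - mean_x) * residual within each firm
    for firm, x, y in rows:
        resid = (y - mean_y) - b * (x - mean_x)
        sse += resid * resid
        firm_scores[firm] = firm_scores.get(firm, 0.0) + (x - mean_x) * resid

    # Default SE treats all n residuals as independent with a common variance.
    se_default = math.sqrt(sse / (n - 2) / sxx)
    # Cluster-robust SE sums scores within each firm before squaring,
    # so within-firm correlation in the errors is allowed for.
    se_cluster = math.sqrt(sum(s * s for s in firm_scores.values())) / sxx
    return b, se_default, se_cluster


b, se_default, se_cluster = slope_and_ses(simulate())
print(f"slope: {b:.3f}")
print(f"default SE: {se_default:.4f}")
print(f"clustered SE: {se_cluster:.4f}")
```

In this setup the slope should land near the true value of 1, yet the clustered standard error should come out noticeably larger than the default one. The gap arises because both the regressor and the composite error carry a persistent firm-level component, so observations from the same firm contain less independent information than the default formula assumes.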
Revision bullets
- Pooled OLS stacks all observations and runs `regress y x`, ignoring the panel structure.
- The entity effect $a_i$ goes into the composite error $v_{it} = a_i + u_{it}$.
- Consistent iff $\mathrm{Cov}(x_{it}, a_i) = 0$; otherwise biased, as with omitted-variable bias.
- Because $a_i$ persists across periods, $v_{it}$ is serially correlated within an entity, so default SEs are wrong.
- Fix the bias with fixed effects (demeaning); fix the inference with clustered standard errors.
- A large pooled-versus-FE coefficient gap signals that $a_i$ was correlated with the regressors.
Quick check
On a stacked firm-year panel you run `regress roa lev size`. Unobserved firm quality is positively correlated with leverage. The pooled OLS slope on `lev` is:
Even if pooled OLS happened to be consistent for $\beta_1$, why are its default standard errors generally wrong on panel data?
Sources
- Wooldridge (2019), Ch. 13. Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage, 2019. Pooling cross sections and two-period panels: shows the composite error and why pooled OLS is inconsistent when the entity effect is correlated with the regressors.
- Wooldridge (2019), Ch. 14. Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage, 2019. Advanced panel methods: contrasts pooled OLS with fixed-effects and random-effects estimation and discusses within-entity serial correlation and clustered standard errors.