
Pooled OLS on Panel Data

Pooled OLS is the naive baseline for panel data: stack all $NT$ observations and run ordinary OLS as if the panel structure did not exist. Start from the unobserved-effects model $y_{it}=x_{it}'\beta+a_i+u_{it}$, where $a_i$ is an unobserved, time-constant entity effect. Pooled OLS folds that effect into a composite error $v_{it}=a_i+u_{it}$ and estimates $y_{it}=x_{it}'\beta+v_{it}$. This is fine only when $a_i$ is uncorrelated with the regressors. The moment $\text{Cov}(a_i,x_{it})\neq 0$, the estimator is biased and inconsistent for $\beta$, because the omitted variable $a_i$ now sits inside the error and is correlated with $x_{it}$.

Why it matters

Picture one regression line drawn through three firms pooled together. That single line ignores who each firm is. If high-quality firms (a large $a_i$) also tend to carry more leverage, the pooled slope on leverage absorbs the firm-quality difference and no longer measures the within-firm effect. There is a second, separate problem even when the slope happens to be consistent: because $a_i$ is the same in every period for a given firm, the composite errors $v_{it}$ are serially correlated within an entity, so the default OLS standard errors are wrong and typically overstate precision. Fixed effects fixes the bias by demeaning $a_i$ away; clustered standard errors fix the inference.
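To see where the serial correlation comes from, here is a short derivation under extra assumptions that the text does not state: $u_{it}$ is serially uncorrelated with constant variance $\sigma_u^2$ and is uncorrelated with $a_i$, which has variance $\sigma_a^2$. For two different periods $t\neq s$ of the same entity:

$$\text{Cov}(v_{it},v_{is})=\text{Cov}(a_i+u_{it},\,a_i+u_{is})=\sigma_a^2, \qquad \text{Corr}(v_{it},v_{is})=\frac{\sigma_a^2}{\sigma_a^2+\sigma_u^2}$$

Under those assumptions the correlation is positive whenever $\sigma_a^2>0$ and does not fade as the periods move further apart, which is why treating the $NT$ rows as independent is not appropriate.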

Formulas

Pooled model with composite error
$$y_{it}=x_{it}'\beta+v_{it}, \qquad v_{it}=a_i+u_{it}$$
Pooled OLS regresses $y_{it}$ on $x_{it}$ over all $NT$ stacked observations, treating the entity effect $a_i$ as part of the error $v_{it}$ rather than estimating it.
Consistency condition
$$\text{plim}\,\hat{\beta}_{POLS}=\beta \iff \text{Cov}(a_i,x_{it})=0$$
If $a_i$ is correlated with any regressor, $\hat{\beta}_{POLS}$ is biased and inconsistent. This is omitted-variable bias with the unobserved $a_i$ as the omitted variable.
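The size and direction of the inconsistency can be sketched for the simplest case. Assume a single regressor plus an intercept, an idiosyncratic error $u_{it}$ that is uncorrelated with $x_{it}$, and moments that are stable across entities and periods (all simplifying assumptions made for illustration):

$$\text{plim}\,\hat{\beta}_{POLS}=\beta+\frac{\text{Cov}(x_{it},v_{it})}{\text{Var}(x_{it})}=\beta+\frac{\text{Cov}(x_{it},a_i)}{\text{Var}(x_{it})}$$

In this case the sign of the bias follows the sign of $\text{Cov}(x_{it},a_i)$: a positive correlation between the entity effect and the regressor pushes the pooled slope above $\beta$, a negative one pushes it below.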

Worked examples

Scenario

Estimate how leverage relates to firm profitability across a panel of firms in Stata.

Solution

On the stacked firm-year data, pooled OLS is simply `regress roa lev size`. Read the slope on `lev` as the cross-firm association in pooled data. The estimate is suspect if unobserved firm quality $a_i$ (management, governance, niche) is correlated with leverage, since that quality lands in the error and biases the coefficient. There is no panel option here on purpose: `regress` ignores the firm identifier entirely.

Note: Declaring the panel first with `xtset firm year` does not change `regress`. It only matters once you move to `xtreg`.

Scenario

Show why the same data needs a fixed-effects estimator instead.

Solution

Compare pooled OLS with within estimation: run `regress roa lev size`, then `xtreg roa lev size, fe`. If the `lev` coefficient shifts materially, the gap is evidence that $a_i$ was correlated with leverage and that pooled OLS was biased. The `fe` estimator removes $a_i$ by demeaning each firm, so it identifies the within-firm effect that pooled OLS could not.

Note: A large pooled-versus-FE gap is the practical signal that the entity effect mattered.
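The pooled-versus-FE gap can also be reproduced on simulated data. The sketch below is a language-neutral illustration in standard-library Python, not the Stata workflow above. Everything in it is an assumption chosen for the demonstration: one regressor standing in for leverage, a true slope of 0.5, a loading of 0.8 of leverage on firm quality, standard normal draws, and the hypothetical helper names `ols_slope` and `simulate`.

```python
import random


def ols_slope(x, y):
    """Slope from regressing y on one regressor x plus an intercept."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / sxx


def simulate(n_firms=500, n_years=6, beta=0.5, seed=0):
    """Return (pooled slope, within slope) on one simulated panel."""
    rng = random.Random(seed)
    x_pool, y_pool, x_dm, y_dm = [], [], [], []
    for _ in range(n_firms):
        a = rng.gauss(0, 1)  # a_i: unobserved firm quality (assumed N(0, 1))
        # Leverage loads on a_i, so Cov(a_i, x_it) = 0.8 != 0 by construction.
        xs = [0.8 * a + rng.gauss(0, 1) for _ in range(n_years)]
        ys = [beta * x + a + rng.gauss(0, 1) for x in xs]
        x_bar, y_bar = sum(xs) / n_years, sum(ys) / n_years
        # Pooled OLS: stack the raw observations and ignore the firm.
        x_pool.extend(xs)
        y_pool.extend(ys)
        # Within transformation: subtract each firm's own time mean,
        # which removes the time-constant a_i.
        x_dm.extend(x - x_bar for x in xs)
        y_dm.extend(y - y_bar for y in ys)
    return ols_slope(x_pool, y_pool), ols_slope(x_dm, y_dm)


pooled, within = simulate()
print(f"true beta = 0.5, pooled = {pooled:.3f}, within = {within:.3f}")
```

Under these assumed parameters the pooled slope should land well above 0.5, while the within slope should stay close to it. That is the same material shift one would look for between `regress` and `xtreg, fe`.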

Common mistakes

  • Believing pooled OLS is always unbiased because OLS is unbiased in a single cross section. With panel data the relevant error is $v_{it}=a_i+u_{it}$, and $a_i$ correlated with $x_{it}$ makes the estimator biased and inconsistent.
  • Thinking the only problem with pooled OLS is wrong standard errors. The deeper issue is bias in $\beta$ from the omitted $a_i$. Serial correlation in $v_{it}$ is a second, separate problem about inference, not about the point estimate.
  • Assuming default `regress` standard errors are valid because the sample is large. Even with consistent slopes, $a_i$ persists across periods, so $v_{it}$ is serially correlated within an entity and the default SEs are wrong without clustering.
  • Treating pooled OLS and fixed effects as interchangeable when $T$ is small. They estimate the same $\beta$ only when $\text{Cov}(a_i,x_{it})=0$. When that fails, only the within or first-difference estimators remove the bias.
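
The inference problem in the third bullet can be illustrated with a small simulation. This is a hedged sketch in standard-library Python under assumed settings: the entity effect is independent of the regressor (so the pooled slope is consistent), the regressor has a persistent firm-level component, and all draws are standard normal. The helper name `pooled_ols_se` is hypothetical. The cluster-robust formula is the plain sandwich with no finite-sample correction, so it will not match Stata's `vce(cluster firm)` output to the last digit.

```python
import math
import random


def pooled_ols_se(n_firms=200, n_years=5, beta=0.5, seed=1):
    """Pooled OLS slope with default and firm-clustered standard errors."""
    rng = random.Random(seed)
    firms = []  # one (xs, ys) pair per firm
    for _ in range(n_firms):
        a = rng.gauss(0, 1)  # a_i, independent of x here (assumption)
        m = rng.gauss(0, 1)  # persistent firm-level part of the regressor
        xs = [m + rng.gauss(0, 1) for _ in range(n_years)]
        ys = [beta * x + a + rng.gauss(0, 1) for x in xs]
        firms.append((xs, ys))

    n = n_firms * n_years
    x_mean = sum(x for xs, _ in firms for x in xs) / n
    y_mean = sum(y for _, ys in firms for y in ys) / n
    sxx = sum((x - x_mean) ** 2 for xs, _ in firms for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean)
              for xs, ys in firms for x, y in zip(xs, ys))
    b = sxy / sxx

    ssr = 0.0   # sum of squared residuals, for the default SE
    meat = 0.0  # sum over firms of squared within-firm score totals
    for xs, ys in firms:
        score = 0.0
        for x, y in zip(xs, ys):
            e = (y - y_mean) - b * (x - x_mean)  # residual with intercept
            ssr += e * e
            score += (x - x_mean) * e
        meat += score ** 2

    default_se = math.sqrt(ssr / (n - 2) / sxx)  # assumes independent errors
    cluster_se = math.sqrt(meat) / sxx           # allows within-firm correlation
    return b, default_se, cluster_se


b, default_se, cluster_se = pooled_ols_se()
print(f"slope = {b:.3f}, default SE = {default_se:.4f}, clustered SE = {cluster_se:.4f}")
```

With a persistent regressor and a persistent error component, the clustered standard error should come out larger than the default one in this setup, even though the slope itself is close to the true value.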

Revision bullets

  • Pooled OLS stacks all $NT$ observations and runs `regress y x`, ignoring the panel structure.
  • The entity effect $a_i$ goes into the composite error $v_{it}=a_i+u_{it}$.
  • Consistent iff $\text{Cov}(a_i,x_{it})=0$, otherwise biased like omitted-variable bias.
  • Because $a_i$ persists across periods, $v_{it}$ is serially correlated within an entity, so default SEs are wrong.
  • Fix the bias with fixed effects (demeaning), fix the inference with clustered standard errors.
  • A large pooled-versus-FE coefficient gap signals that $a_i$ was correlated with the regressors.

Quick check

On a stacked firm-year panel you run `regress roa lev size`. Unobserved firm quality $a_i$ is positively correlated with leverage. The pooled OLS slope on `lev` is:

Even if pooled OLS happened to be consistent for $\beta$, why are its default standard errors generally wrong on panel data?

Sources

  1. Wooldridge (2019), Ch. 13
    Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage, 2019.
    Pooling cross sections and two-period panels: shows the composite error $a_i+u_{it}$ and why pooled OLS is inconsistent when the entity effect is correlated with the regressors.
  2. Wooldridge (2019), Ch. 14
    Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage, 2019.
    Advanced panel methods: contrasts pooled OLS with fixed-effects and random-effects estimation and discusses within-entity serial correlation and clustered standard errors.
How to cite this page
Dr. Phil's Quant Lab. (2026). Pooled OLS on Panel Data. Derivatives Atlas. https://phucnguyenvan.com/concept/efm-pooled-ols
Built by Dr. Phuc V. Nguyen