Pooled OLS on Panel Data

Pooled OLS is the naive baseline for panel data: stack all $NT$ observations and run ordinary OLS as if the panel structure did not exist. Start from the unobserved-effects model $y_{it}=x_{it}'\beta+a_i+u_{it}$, where $a_i$ is an unobserved, time-constant entity effect. Pooled OLS folds that effect into a composite error $v_{it}=a_i+u_{it}$ and estimates $y_{it}=x_{it}'\beta+v_{it}$. This is fine only when $a_i$ is uncorrelated with the regressors. As soon as $\text{Cov}(a_i,x_{it})\neq 0$, the estimator is biased and inconsistent for $\beta$, because the omitted variable $a_i$ now sits inside the error and is correlated with $x_{it}$.

Why it matters

Picture one regression line drawn through three firms pooled together. That single line ignores which firm each observation belongs to. If high-quality firms (a large $a_i$) also tend to carry more leverage, the pooled slope on leverage absorbs the firm-quality difference and no longer measures the within-firm effect. There is a second, separate problem even when the slope happens to be consistent: because $a_i$ is the same in every period for a given firm, the composite errors $v_{it}$ are serially correlated within an entity, so the default OLS standard errors are wrong and usually overstate precision. Fixed effects removes the bias by demeaning $a_i$ away; clustered standard errors fix the inference.
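
The serial-correlation claim can be made explicit under extra assumptions that are not stated in the text above: suppose $a_i$ has variance $\sigma_a^2$, $u_{it}$ has variance $\sigma_u^2$, and $u_{it}$ is serially uncorrelated and uncorrelated with $a_i$. Then, for two different periods $t\neq s$ of the same entity:

$$\text{Cov}(v_{it},v_{is})=\sigma_a^2, \qquad \text{Corr}(v_{it},v_{is})=\frac{\sigma_a^2}{\sigma_a^2+\sigma_u^2}$$

Under these assumptions the correlation is positive whenever $\sigma_a^2>0$ and does not fade as the two periods move further apart, which is the dependence the default OLS variance formula ignores.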

Formulas

Pooled model with composite error

$$y_{it}=x_{it}'\beta+v_{it}, \qquad v_{it}=a_i+u_{it}$$

Pooled OLS regresses $y_{it}$ on $x_{it}$ over all $NT$ stacked observations, treating the entity effect $a_i$ as part of the error $v_{it}$ rather than estimating it.

Consistency condition

$$\text{plim}\,\hat{\beta}_{POLS}=\beta \iff \text{Cov}(a_i,x_{it})=0$$

If $a_i$ is correlated with any regressor, $\hat{\beta}_{POLS}$ is biased and inconsistent. This is omitted-variable bias with the unobserved $a_i$ as the omitted variable.
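
To make the size of the inconsistency concrete, consider the special case of a single regressor. Assume, beyond what is stated above, that $u_{it}$ is uncorrelated with $x_{it}$ and that the moments are the same in every period. The pooled probability limit then takes the familiar omitted-variable form:

$$\text{plim}\,\hat{\beta}_{POLS}=\beta+\frac{\text{Cov}(x_{it},v_{it})}{\text{Var}(x_{it})}=\beta+\frac{\text{Cov}(x_{it},a_i)}{\text{Var}(x_{it})}$$

The second term is zero exactly when $\text{Cov}(a_i,x_{it})=0$, which matches the consistency condition. In this one-regressor case a positive covariance pushes the pooled slope above $\beta$; with several regressors the direction depends on the partial correlations.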

Worked examples

Scenario

Estimate how leverage relates to firm profitability across a panel of firms in Stata.

Solution

On the stacked firm-year data, pooled OLS is simply `regress roa lev size`. Read the slope on `lev` as the cross-firm association in pooled data. The estimate is suspect if unobserved firm quality $a_i$ (management, governance, niche) is correlated with leverage, since that quality lands in the error and biases the coefficient. There is no panel option here on purpose: `regress` ignores the firm identifier entirely.

Note: Declaring the panel first with `xtset firm year` does not change `regress`. It only matters once you move to `xtreg`.
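
As a minimal sketch of the pooled regression in this example, assuming the data are already in long firm-year format and that the firm identifier is the variable `firm` used in the `xtset` call. The clustered variant is an addition for illustration and is not part of the original example:

```stata
* Pooled OLS on the stacked firm-year data: the firm identifier is ignored
regress roa lev size

* Same regression with standard errors clustered by firm
* (assumes the firm identifier variable is named firm)
regress roa lev size, vce(cluster firm)
```

The two commands report identical coefficients; only the standard errors differ. Clustering addresses the inference problem and does nothing about bias from a correlated $a_i$.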

Scenario

Show why the same data needs a fixed-effects estimator instead.

Solution

Compare pooled OLS with within estimation: run `regress roa lev size`, then `xtreg roa lev size, fe`. If the `lev` coefficient shifts materially, the gap is evidence that $a_i$ was correlated with leverage and that pooled OLS was biased. The `fe` estimator removes $a_i$ by demeaning each firm, so it identifies the within-firm effect that pooled OLS could not.

Note: A large pooled-versus-FE gap is the practical signal that the entity effect mattered.
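
One way to line the two estimators up side by side is sketched below. It assumes the panel identifiers are named `firm` and `year`, and it adds firm-clustered standard errors to both models, which goes beyond the commands in the example:

```stata
* Declare the panel structure (assumes identifiers firm and year)
xtset firm year

* Pooled OLS with firm-clustered standard errors
regress roa lev size, vce(cluster firm)
estimates store pooled

* Within (fixed-effects) estimator with firm-clustered standard errors
xtreg roa lev size, fe vce(cluster firm)
estimates store fe

* Coefficients and standard errors in adjacent columns
estimates table pooled fe, b(%9.4f) se(%9.4f)
```

Reading across the `lev` row of the table gives the pooled-versus-FE gap directly. The stored names `pooled` and `fe` are arbitrary labels chosen for this sketch.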

Common mistakes

✗ Believing pooled OLS is always unbiased because OLS is unbiased in a single cross section. With panel data the relevant error is $v_{it}=a_i+u_{it}$, and $a_i$ correlated with $x_{it}$ makes the estimator biased and inconsistent.
✗ Thinking the only problem with pooled OLS is wrong standard errors. The deeper issue is bias in $\beta$ from the omitted $a_i$. Serial correlation in $v_{it}$ is a second, separate problem about inference, not about the point estimate.
✗ Assuming default `regress` standard errors are valid because the sample is large. Even with consistent slopes, $a_i$ persists across periods, so $v_{it}$ is serially correlated within an entity and the default SEs are wrong without clustering.
✗ Treating pooled OLS and fixed effects as interchangeable when $T$ is small. They estimate the same $\beta$ only when $\text{Cov}(a_i,x_{it})=0$. When that fails, only the within or first-difference estimators remove the bias.
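
The first two mistakes can be checked on simulated data. The sketch below is purely illustrative: the variable names (`a`, `x`, `y`), the sample sizes, the seed, and the coefficients are all hypothetical choices, with the true slope set to 1 and the regressor built to be correlated with the firm effect.

```stata
clear
set seed 12345

* 500 hypothetical firms, each with a time-constant effect a
set obs 500
generate firm = _n
generate a = rnormal()

* Five periods per firm; expand copies a unchanged into every period
expand 5
bysort firm: generate year = _n

* Regressor correlated with a; the true slope on x is 1
generate x = 0.8*a + rnormal()
generate y = 1*x + a + rnormal()

* Pooled OLS: a sits in the error and is correlated with x
regress y x, vce(cluster firm)

* Fixed effects: demeaning by firm removes a
xtset firm year
xtreg y x, fe vce(cluster firm)
```

Under this design the population value of the pooled slope is $1+0.8/1.64\approx 1.49$, while the within estimator targets the true value of 1. Estimates from a given run should fall near those values, but not exactly on them.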

Revision bullets

• Pooled OLS stacks all $NT$ observations and runs `regress y x`, ignoring the panel structure.
• The entity effect $a_i$ goes into the composite error $v_{it}=a_i+u_{it}$.
• Consistent iff $\text{Cov}(a_i,x_{it})=0$; otherwise biased and inconsistent, exactly as in omitted-variable bias.
• Because $a_i$ persists across periods, $v_{it}$ is serially correlated within an entity, so default SEs are wrong.
• Fix the bias with fixed effects (demeaning); fix the inference with clustered standard errors.
• A large pooled-versus-FE coefficient gap signals that $a_i$ was correlated with the regressors.
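
The demeaning step named in the bullets can be written out in one line. With bars denoting each entity's average over its time periods (notation introduced here for illustration), averaging the model gives $\bar{y}_i=\bar{x}_i'\beta+a_i+\bar{u}_i$, and subtracting that from the original equation yields:

$$y_{it}-\bar{y}_i=(x_{it}-\bar{x}_i)'\beta+(u_{it}-\bar{u}_i)$$

Because $a_i$ is constant over time it equals its own average and cancels, so the demeaned regression no longer has $a_i$ in its error, whatever its correlation with $x_{it}$.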

Quick check

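On a stacked firm-year panel you run `regress roa lev size`. Unobserved firm quality $a_i$ is positively correlated with leverage. The pooled OLS slope on `lev` is:

Biased and inconsistent, because $a_i$ is an omitted variable correlated with the regressor. Pooled OLS pushes $a_i$ into the composite error $v_{it}=a_i+u_{it}$. When $a_i$ is correlated with leverage, the error is correlated with the regressor, which is exactly omitted-variable bias. Large samples do not rescue the estimate because the violation is in the population moment condition. This is distinct from the separate standard-error problem.

Even if pooled OLS happened to be consistent for $\beta$, why are its default standard errors generally wrong on panel data?

Because $a_i$ persists across periods. The composite error contains the time-constant $a_i$, which is identical in every period for a given entity. That induces positive serial correlation in $v_{it}$ within each entity, violating the independence the default OLS variance formula assumes. The default SEs are then biased, usually too small. Clustering the standard errors by entity restores valid inference, which is why the Clustered SEs topic is the natural next step.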

Connected topics

Omitted-variable bias, Panel Structure, Fixed Effects, Random Effects, Clustered SEs

How to cite this page

Dr. Phil's Quant Lab. (2026). Pooled OLS on Panel Data. Derivatives Atlas. https://phucnguyenvan.com/concept/efm-pooled-ols