Simple Regression (intermediate)

The Zero Conditional Mean Assumption

The zero conditional mean assumption states $E(u \mid x) = 0$: the average value of the unobserved factors is the same (zero) at every value of $x$. This is the key identifying assumption that lets OLS recover the causal effect $\beta_1$ rather than a mere correlation, and it implies the population relationship $E(y \mid x) = \beta_0 + \beta_1 x$. It fails whenever an omitted factor in $u$ is correlated with $x$, for example unobserved ability in a wage-on-education regression. When it fails in this way, OLS is biased and inconsistent, and the slope no longer carries a ceteris paribus interpretation.

Why it matters

Imagine sorting people into groups by their value of $x$, say years of schooling. The assumption says that within every schooling group, the average of all the other stuff in $u$ is the same, namely zero. If people with more schooling also tend to have higher ability, then $u$ is systematically larger in the high-$x$ group, the assumption breaks, and OLS credits ability’s effect to schooling. This is the single hinge on which causal interpretation turns, which is why it connects the population model to unbiasedness and to omitted variable bias.
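The grouping thought experiment above can be sketched as a small simulation. This is a minimal illustration, not a claim about real data: the schooling rule, the coefficients, and the function name `group_means_of_u` are all hypothetical. It sorts simulated people by years of schooling and reports the average of $u$ inside each group, once when ability pushes schooling and once when it does not.

```python
import random
from collections import defaultdict
from statistics import fmean


def group_means_of_u(n=50_000, ability_effect_on_schooling=1.0, seed=0):
    """Return {years_of_schooling: average u} for simulated people.

    All numbers here are made up for illustration.
    """
    rng = random.Random(seed)
    u_by_group = defaultdict(list)
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)   # unobserved, so it ends up inside u
        luck = rng.gauss(0.0, 1.0)
        # Hypothetical schooling rule: 12 years, shifted by ability and luck,
        # rounded to whole years and clamped to the range 10..14.
        schooling = round(12 + ability_effect_on_schooling * ability + luck)
        schooling = max(10, min(14, schooling))
        u = ability + rng.gauss(0.0, 1.0)   # u = ability + other mean-zero factors
        u_by_group[schooling].append(u)
    return {s: fmean(vals) for s, vals in sorted(u_by_group.items())}


# Ability raises schooling: the group averages of u drift with schooling.
confounded = group_means_of_u(ability_effect_on_schooling=1.0)
# Ability does not affect schooling: every group average of u sits near zero.
clean = group_means_of_u(ability_effect_on_schooling=0.0)

for years in sorted(confounded):
    print(years, round(confounded[years], 2), round(clean[years], 2))
```

In the confounded run the average of $u$ should rise from clearly negative in the lowest schooling group to clearly positive in the highest, which is exactly the pattern the assumption rules out.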

Formulas

Zero conditional mean

$$E(u \mid x) = 0$$

Stronger than zero correlation, $\text{Cov}(x, u) = 0$. It requires the mean of $u$ to be zero at every value of $x$, ruling out any systematic dependence.
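To see why zero correlation is the weaker condition, here is a small sketch with an error term chosen purely for illustration: $x$ is standard normal and $u = x^2 - 1$. This $u$ has mean zero and is uncorrelated with $x$ (because $E(x^3) = 0$), yet its conditional mean $E(u \mid x) = x^2 - 1$ clearly moves with $x$.

```python
import random
from statistics import fmean

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
# Illustrative error term: mean zero, uncorrelated with x, but E(u | x) = x^2 - 1.
u = [xi ** 2 - 1.0 for xi in x]


def sample_cov(a, b):
    """Sample covariance (dividing by n, which is fine at this sample size)."""
    ma, mb = fmean(a), fmean(b)
    return fmean((ai - ma) * (bi - mb) for ai, bi in zip(a, b))


cov_xu = sample_cov(x, u)

# Average of u among observations with x near zero, and among observations in the tails.
mean_u_near_zero = fmean(ui for xi, ui in zip(x, u) if abs(xi) < 0.5)
mean_u_in_tails = fmean(ui for xi, ui in zip(x, u) if abs(xi) > 1.5)

print("Cov(x, u):", round(cov_xu, 3))
print("mean of u for |x| < 0.5:", round(mean_u_near_zero, 2))
print("mean of u for |x| > 1.5:", round(mean_u_in_tails, 2))
```

The sample covariance should be close to zero while the two group averages of $u$ differ sharply, so $\text{Cov}(x, u) = 0$ holds but $E(u \mid x) = 0$ does not.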

Implied conditional mean of $y$

$$E(y \mid x) = \beta_0 + \beta_1 x$$

Under the assumption, the population regression function is exactly the line, so $\beta_1$ is the causal effect of $x$ on the average of $y$.
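The step from the assumption to this line can be written out, assuming the simple linear population model $y = \beta_0 + \beta_1 x + u$ (the model is implied by the page but not written on it). Taking expectations conditional on $x$:

$$
E(y \mid x) = E(\beta_0 + \beta_1 x + u \mid x) = \beta_0 + \beta_1 x + E(u \mid x) = \beta_0 + \beta_1 x
$$

The last equality uses $E(u \mid x) = 0$. If instead $E(u \mid x) = g(x)$ for some nonzero function $g$, the same steps give $E(y \mid x) = \beta_0 + \beta_1 x + g(x)$, and the population regression function is no longer the line.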

Worked examples

Scenario

A researcher runs `regress wage educ` and wants to interpret the coefficient on `educ` as the causal return to schooling.

Solution

That causal reading is valid only if $E(u \mid \text{educ}) = 0$, meaning unobserved factors such as innate ability, motivation, and family background do not vary systematically with education. In practice more able people tend to acquire more schooling, so $u$ and `educ` are positively correlated, the assumption fails, and the OLS coefficient overstates the true return. The estimate then mixes the schooling effect with the ability effect.

Note: This identification problem motivates multiple regression controls and instrumental variables later in the course.
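The scenario can be mimicked with simulated data, as a rough Python analogue of `regress wage educ`. Everything numeric here is an assumption made for illustration: the true return `TRUE_RETURN`, the way ability feeds into schooling and wages, and the helper `ols_slope`, which computes the simple regression slope by hand.

```python
import random
from statistics import fmean


def ols_slope(x, y):
    """Simple-regression OLS slope: sample Cov(x, y) / sample Var(x)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(42)
n = 100_000
TRUE_RETURN = 0.08   # hypothetical causal effect of one more year of schooling

ability = [rng.gauss(0.0, 1.0) for _ in range(n)]
# More able people acquire more schooling (hypothetical rule).
educ = [12 + 1.5 * a + rng.gauss(0.0, 2.0) for a in ability]
# Ability is unobserved, so it sits inside the error term u.
u = [0.10 * a + rng.gauss(0.0, 0.3) for a in ability]
wage = [1.5 + TRUE_RETURN * e + ui for e, ui in zip(educ, u)]

slope = ols_slope(educ, wage)
# Under these made-up numbers, Cov(educ, u) / Var(educ) = 0.15 / 6.25 = 0.024,
# so the population OLS slope is about 0.104 rather than 0.08.
print("true return:", TRUE_RETURN, "OLS slope:", round(slope, 3))
```

The estimated slope lands above the true return by roughly $\text{Cov}(\text{educ}, u) / \text{Var}(\text{educ})$, which is the schooling effect and the ability effect mixed together.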

Common mistakes

✗ $E(u \mid x) = 0$ is the same as $E(u) = 0$. The unconditional mean $E(u) = 0$ is a harmless normalization absorbed by the intercept. The conditional version $E(u \mid x) = 0$ is far stronger and is the assumption that does the identifying work.
✗ Zero correlation between $x$ and $u$ is enough for causal interpretation. $\text{Cov}(x, u) = 0$ is weaker than $E(u \mid x) = 0$. The conditional mean assumption rules out all forms of mean dependence, not just linear correlation, and is what Wooldridge invokes for unbiasedness.
✗ You can test the assumption directly using the residuals. OLS forces $\sum x_i \hat{u}_i = 0$ by construction, so the sample residuals are mechanically uncorrelated with $x$. This tells you nothing about whether the population assumption holds.
✗ If the assumption fails, the slope is meaningless. What OLS estimates is still a well-defined population quantity, the best linear predictor slope, but it no longer equals the causal effect $\beta_1$. The failure changes the interpretation of the estimate, not its existence.
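The point about residuals can be checked numerically. The sketch below builds data in which the assumption fails by design (the true errors share an ability component with $x$), fits OLS by hand, and then computes $\sum x_i \hat{u}_i$. The data-generating numbers are hypothetical.

```python
import random
from statistics import fmean

rng = random.Random(7)
n = 20_000
ability = [rng.gauss(0.0, 1.0) for _ in range(n)]
x = [a + rng.gauss(0.0, 1.0) for a in ability]
u = [a + rng.gauss(0.0, 1.0) for a in ability]   # shares ability with x, so E(u | x) != 0
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]  # true slope is 2.0

# OLS with an intercept, computed by hand.
mx, my = fmean(x), fmean(y)
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
b0 = my - b1 * mx
residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# Residuals are orthogonal to x by construction (up to floating-point error).
sum_x_resid = sum(xi * ri for xi, ri in zip(x, residuals))
# The true errors are not orthogonal to x.
mean_x_times_u = fmean(xi * ui for xi, ui in zip(x, u))

print("sum of x * residual:", sum_x_resid)
print("average of x * true error:", round(mean_x_times_u, 2))
print("OLS slope:", round(b1, 2), "(true slope is 2.0)")
```

The residual sum is zero to machine precision even though the slope estimate is well above 2.0, so the residuals give no warning that the assumption has failed.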

Revision bullets

• Assumption is $E(u \mid x) = 0$, the key identifying condition
• Stronger than zero correlation $\text{Cov}(x, u) = 0$
• Implies $E(y \mid x) = \beta_0 + \beta_1 x$, so $\beta_1$ is the causal effect
• Fails when an omitted factor in $u$ is correlated with $x$ (e.g. ability)
• Cannot be checked with OLS residuals; $\sum x_i \hat{u}_i = 0$ always holds

Quick check

The zero conditional mean assumption $E(u \mid x) = 0$ requires that:

In `regress wage educ`, the assumption $E(u \mid \text{educ}) = 0$ would most plausibly fail because:

Sources

Wooldridge (2019), §2.5 (zero conditional mean)
Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage Learning, 2019. ISBN 978-1-337-55886-0.
Section 2.5 introduces SLR.4, the zero conditional mean assumption, contrasts it with zero correlation, and explains its role in identifying the causal slope.
Wooldridge (2019), §3.3 (omitted variables)
Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage Learning, 2019.
Links the failure of the zero conditional mean assumption to omitted variable bias and the direction of that bias.

How to cite this page

Dr. Phil's Quant Lab. (2026). The Zero Conditional Mean Assumption. Derivatives Atlas. https://phucnguyenvan.com/concept/efm-zero-conditional-mean
