Simple Regression (intermediate)

The Zero Conditional Mean Assumption

The zero conditional mean assumption states that $E(u \mid x) = 0$: the average value of the unobserved factors is the same (zero) at every value of $x$. This is the key identifying assumption that lets OLS recover the causal effect $\beta_1$ rather than a mere correlation, and it implies the population relationship $E(y \mid x) = \beta_0 + \beta_1 x$. It fails whenever an omitted factor in $u$ is correlated with $x$, for example unobserved ability in a wage-on-education regression. When it fails in this way, OLS is biased and inconsistent, and the slope no longer carries a ceteris paribus interpretation.

Why it matters

Imagine sorting people into groups by their value of $x$, say years of schooling. The assumption says that within every schooling group, the average of all the other stuff in $u$ is the same, namely zero. If people with more schooling also tend to have higher ability, then $u$ is systematically larger in the high-$x$ group, the assumption breaks, and OLS credits ability's effect to schooling. This is the single hinge on which causal interpretation turns, which is why it connects the population model to unbiasedness and to omitted variable bias.
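The sorting thought experiment can be sketched with simulated data. This is a minimal illustration in Python with NumPy, not part of the original material: the data-generating process, the coefficients, and the names `ability`, `educ`, `u_bad`, and `u_ok` are hypothetical assumptions chosen only to make the point visible.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical population: schooling rises with unobserved ability.
ability = rng.normal(0.0, 1.0, n)
educ = np.clip(np.round(12 + 2.0 * ability + rng.normal(0.0, 2.0, n)), 8, 20)

# Case 1: u contains ability, so its mean shifts with schooling.
u_bad = 1.5 * ability + rng.normal(0.0, 1.0, n)
# Case 2: u is pure noise, unrelated to schooling.
u_ok = rng.normal(0.0, 1.0, n)

groups = [(8, 11), (12, 15), (16, 20)]  # schooling bands in years

def group_means(u):
    """Average of u within each schooling band."""
    return [u[(educ >= lo) & (educ <= hi)].mean() for lo, hi in groups]

means_bad = group_means(u_bad)
means_ok = group_means(u_ok)

for (lo, hi), mb, mo in zip(groups, means_bad, means_ok):
    print(f"educ {lo}-{hi}: mean u (ability in u) = {mb:+.2f}, "
          f"mean u (pure noise) = {mo:+.2f}")
```

Under these assumed numbers, the group averages of `u_bad` should rise with schooling, while those of `u_ok` should stay close to zero in every band, which is what the assumption requires.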

Formulas

Zero conditional mean
$E(u \mid x) = 0$
Stronger than zero correlation, $\text{Cov}(x, u) = 0$. It requires the mean of $u$ to be zero at every value of $x$, ruling out any systematic dependence of the mean of $u$ on $x$.
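A small numerical sketch can show the gap between the two conditions. The construction below is an illustrative assumption, not an example from the source: with $x$ standard normal and $u = x^2 - 1$, the covariance is zero because $E(x^3) = 0$, yet $E(u \mid x) = x^2 - 1$ clearly varies with $x$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

x = rng.normal(0.0, 1.0, n)
# Hypothetical error term: mean zero and uncorrelated with x,
# but its conditional mean E(u | x) = x**2 - 1 depends on x.
u = x**2 - 1.0

cov_xu = np.cov(x, u)[0, 1]                 # sample Cov(x, u), near zero
mean_u_center = u[np.abs(x) < 0.5].mean()   # average u when x is near 0
mean_u_tails = u[np.abs(x) > 1.5].mean()    # average u when x is far from 0

print(f"Cov(x, u)             = {cov_xu:+.3f}")
print(f"mean u for |x| < 0.5  = {mean_u_center:+.3f}")
print(f"mean u for |x| > 1.5  = {mean_u_tails:+.3f}")
```

Zero covariance holds here, but the zero conditional mean assumption does not, since the average of `u` is negative near the center of `x` and positive in the tails.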
Implied conditional mean of y
$E(y \mid x) = \beta_0 + \beta_1 x$
Under the assumption, the population regression function is exactly the line, so $\beta_1$ is the causal effect of $x$ on the average of $y$.
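Both formulas follow in a line or two once the population model is written out. The derivation below is a sketch that assumes the standard simple regression model $y = \beta_0 + \beta_1 x + u$ and uses the law of iterated expectations.

```latex
\begin{align*}
E(y \mid x) &= E(\beta_0 + \beta_1 x + u \mid x)
             = \beta_0 + \beta_1 x + E(u \mid x)
             = \beta_0 + \beta_1 x, \\[4pt]
E(u)        &= E\big[E(u \mid x)\big] = 0, \\[4pt]
\text{Cov}(x, u) &= E(xu) - E(x)\,E(u)
             = E\big[x\,E(u \mid x)\big] - E(x)\cdot 0
             = 0.
\end{align*}
```

The last line shows why the assumption implies zero correlation. The converse does not hold, because $\text{Cov}(x, u) = 0$ restricts only the linear relationship between $x$ and $u$.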

Worked examples

Scenario

A researcher runs `regress wage educ` and wants to interpret the coefficient on `educ` as the causal return to schooling.

Solution

That causal reading is valid only if $E(u \mid \text{educ}) = 0$, meaning unobserved factors such as innate ability, motivation, and family background do not vary systematically with education. In practice more able people tend to acquire more schooling, so $u$ and `educ` are correlated and the assumption fails. Because ability also raises wages, the OLS coefficient overstates the true return: the estimate mixes the schooling effect with the ability effect.

Note: This identification problem motivates multiple regression controls and instrumental variables later in the course.
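The worked example can be mimicked with simulated data where the true return is known. This is a hedged sketch in Python with NumPy rather than Stata: the parameter values (`beta1 = 1.0`, `gamma = 2.0`), the data-generating process, and the helper `ols_slope` are hypothetical, and real wage data would not come with an observed `ability` variable.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
beta0, beta1, gamma = 5.0, 1.0, 2.0  # hypothetical "true" parameters

ability = rng.normal(0.0, 1.0, n)
educ = 12 + 1.5 * ability + rng.normal(0.0, 2.0, n)  # abler people get more schooling
wage = beta0 + beta1 * educ + gamma * ability + rng.normal(0.0, 1.0, n)

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    xc = x - x.mean()
    return (xc @ (y - y.mean())) / (xc @ xc)

# Short regression: wage on educ only, so ability sits in the error term.
slope_short = ols_slope(educ, wage)

# Omitted variable bias formula: beta1 + gamma * Cov(educ, ability) / Var(educ).
delta = ols_slope(educ, ability)
slope_predicted = beta1 + gamma * delta

# Long regression: controlling for ability (possible only in a simulation).
X = np.column_stack([np.ones(n), educ, ability])
coef_long, *_ = np.linalg.lstsq(X, wage, rcond=None)

print(f"true return            = {beta1:.3f}")
print(f"short regression slope = {slope_short:.3f}")
print(f"OVB formula prediction = {slope_predicted:.3f}")
print(f"slope with control     = {coef_long[1]:.3f}")
```

Under these assumptions the short regression slope should sit well above the true return and close to the omitted variable bias prediction, while the regression that controls for ability should recover a slope near `beta1`.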

Common mistakes

  • "$E(u \mid x) = 0$ is the same as $E(u) = 0$." The unconditional mean $E(u) = 0$ is a harmless normalization absorbed by the intercept. The conditional version $E(u \mid x) = 0$ is far stronger and is the assumption that does the identifying work.
  • "Zero correlation between $x$ and $u$ is enough for causal interpretation." $\text{Cov}(x, u) = 0$ is weaker than $E(u \mid x) = 0$. The conditional mean assumption rules out all forms of mean dependence, not just linear correlation, and is what Wooldridge invokes for unbiasedness.
  • "You can test the assumption directly using the residuals." OLS forces $\sum x_i \hat{u}_i = 0$ by construction, so the sample residuals are mechanically uncorrelated with $x$. This tells you nothing about whether the population assumption holds.
  • "If the assumption fails, the slope is meaningless." It is still a well-defined population quantity, the best linear predictor slope, but it no longer equals the causal effect. The failure changes the interpretation, not the existence, of $\beta_1$.
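The point about residuals can be checked numerically. In the sketch below (Python with NumPy, with a hypothetical data-generating process and illustrative coefficients), the true error is built to be correlated with `x`, so the assumption fails by design, yet the OLS residuals are still orthogonal to `x`.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

x = rng.normal(0.0, 1.0, n)
# Hypothetical true error: depends on x, so E(u | x) = 0.8 * x is not zero.
u = 0.8 * x + rng.normal(0.0, 1.0, n)
y = 1.0 + 2.0 * x + u  # true slope is 2.0

# OLS of y on a constant and x.
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b

sum_x_resid = x @ resid                   # zero up to floating-point error
corr_x_true_u = np.corrcoef(x, u)[0, 1]   # clearly nonzero

print(f"estimated slope      = {b[1]:.3f}  (true slope 2.0)")
print(f"sum of x * residual  = {sum_x_resid:.2e}")
print(f"corr(x, true error)  = {corr_x_true_u:.3f}")
```

The residuals look clean even though the slope estimate absorbs the 0.8 from the error term, which is why inspecting residuals cannot validate the population assumption.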

Revision bullets

  • Assumption is $E(u \mid x) = 0$, the key identifying condition
  • Stronger than zero correlation $\text{Cov}(x, u) = 0$
  • Implies $E(y \mid x) = \beta_0 + \beta_1 x$, so $\beta_1$ is causal
  • Fails when an omitted factor in $u$ is correlated with $x$ (e.g. ability)
  • Cannot be checked with OLS residuals; $\sum x_i \hat{u}_i = 0$ always holds

Sources

  1. Wooldridge (2019), §2.5
    Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage Learning, 2019. ISBN 978-1-337-55886-0.
    Section 2.5 introduces SLR.4, the zero conditional mean assumption, contrasts it with zero correlation, and explains its role in identifying the causal slope.
  2. Wooldridge (2019), §3.3 (omitted variables)
    Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage Learning, 2019.
    Links the failure of the zero conditional mean assumption to omitted variable bias and the direction of that bias.
How to cite this page
Dr. Phil's Quant Lab. (2026). The Zero Conditional Mean Assumption. Derivatives Atlas. https://phucnguyenvan.com/concept/efm-zero-conditional-mean