The Linear Probability Model
When the dependent variable is binary, OLS on that 0/1 outcome is the linear probability model (LPM). Because $E(y \mid \mathbf{x}) = P(y = 1 \mid \mathbf{x})$ for a binary $y$, each slope is the change in the probability of success for a one-unit change in a regressor, holding the other regressors fixed. The LPM is easy to estimate and read, but it has two flaws: fitted values can fall outside $[0, 1]$, and the error is inherently heteroskedastic with variance $p(\mathbf{x})[1 - p(\mathbf{x})]$, where $p(\mathbf{x}) = P(y = 1 \mid \mathbf{x})$. It is the entry point to limited dependent variable models such as logit and probit.
Why it matters
Sometimes the outcome is a yes or a no: a person is in the labor force or not, a loan is approved or not. Coding it 0/1 and running OLS gives coefficients you can read straight off as "this raises the chance of a yes by so many percentage points," which is wonderfully concrete. The catch is that a straight line does not respect the 0-to-1 fence, so for extreme inputs it can predict a probability above one or below zero. Such a prediction is nonsense, and it is a warning that the line is only a local approximation.
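To make both points concrete, here is a minimal sketch in plain Python that fits an LPM with one regressor by simple OLS. The eight observations are made up for illustration, and the helper name `ols_fit` is an assumption, not part of any library.

```python
# Minimal LPM sketch: OLS of a 0/1 outcome on one regressor (pure Python).
# The data below are hypothetical and chosen only for illustration.

def ols_fit(x, y):
    """Return (intercept, slope) from simple OLS of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical sample: years of schooling and a 0/1 labor-force indicator.
educ = [8, 10, 12, 12, 14, 16, 16, 18]
inlf = [0, 0, 0, 1, 1, 1, 1, 1]

intercept, slope = ols_fit(educ, inlf)
print(f"slope: {slope:.3f} -> {100 * slope:.1f} percentage points per year")

# Extrapolating far enough pushes the fitted "probability" out of [0, 1].
for years in (4, 12, 24):
    p_hat = intercept + slope * years
    flag = "" if 0.0 <= p_hat <= 1.0 else "  <- outside [0, 1]"
    print(f"educ={years:2d}: fitted probability {p_hat:.3f}{flag}")
```

The slope is read directly as a change in probability, while the loop shows that the same straight line leaves the unit interval once the regressor moves far from the bulk of the data.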
Formulas
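The formulas this section rests on can be written as follows. This is a reconstruction in standard textbook notation; the symbols $\beta_j$, $\mathbf{x}$, $u$, and $p(\mathbf{x})$ are notational assumptions.

```latex
% Model: for a binary y, the conditional mean is the success probability.
P(y = 1 \mid \mathbf{x}) = E(y \mid \mathbf{x})
  = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k

% Slope interpretation, holding the other regressors fixed.
\Delta P(y = 1 \mid \mathbf{x}) = \beta_j \, \Delta x_j

% Error variance. Write p(x) = P(y = 1 | x) and u = y - p(x).
% Then u = 1 - p(x) with probability p(x), and u = -p(x) with probability 1 - p(x).
\operatorname{Var}(u \mid \mathbf{x})
  = p(\mathbf{x})\,[1 - p(\mathbf{x})]^2 + [1 - p(\mathbf{x})]\,p(\mathbf{x})^2
  = p(\mathbf{x})\,[1 - p(\mathbf{x})]
```

Because $p(\mathbf{x})$ changes with the regressors, the error variance does too, which is why the heteroskedasticity is built in rather than an accident of the data.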
Worked examples
Model whether a married woman is in the labor force as a function of education and number of young children.
Run `regress inlf educ kidslt6, robust`. A coefficient of about 0.038 on `educ` means each extra year of schooling raises the probability of being in the labor force by about 3.8 percentage points, holding `kidslt6` fixed. The `robust` option corrects the standard errors for the built-in heteroskedasticity.
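The difference between default and heteroskedasticity-robust standard errors can be sketched in plain Python for the one-regressor case. The data are hypothetical, the function name `lpm_slope_se` is an assumption, and the robust formula uses the $n/(n-k)$ scaling usually labeled HC1, which is what Stata's `robust` option is generally understood to apply.

```python
import math

def lpm_slope_se(x, y):
    """Simple-regression LPM: return slope, classical SE, and HC1 robust SE."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dev = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dev)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dev, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    # Classical formula: assumes one common error variance for all observations.
    sigma2 = sum(u * u for u in resid) / (n - 2)
    se_classical = math.sqrt(sigma2 / sxx)

    # Robust (White) formula: each squared residual is weighted by its own
    # squared deviation in x, then scaled by n / (n - k) with k = 2.
    meat = sum((d * u) ** 2 for d, u in zip(dev, resid))
    se_robust = math.sqrt((n / (n - 2)) * meat / sxx ** 2)
    return slope, se_classical, se_robust

# Hypothetical sample: years of schooling and a 0/1 labor-force indicator.
educ = [8, 10, 12, 12, 14, 16, 16, 18]
inlf = [0, 0, 0, 1, 1, 1, 1, 1]

slope, se_classical, se_robust = lpm_slope_se(educ, inlf)
print(f"slope={slope:.4f}  classical SE={se_classical:.4f}  robust SE={se_robust:.4f}")
```

The point estimate is identical under both formulas; only the standard error changes, and the robust one can be larger or smaller than the classical one depending on the sample.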
Estimate the probability that a mortgage application is approved given the applicant’s debt-to-income ratio.
Run `regress approve dti, robust`. The slope is the change in approval probability per unit of debt-to-income ratio. For very low or very high ratios the predicted probability can exceed one or drop below zero, a reminder that logit or probit may fit the tails better while the LPM still gives a clear average marginal effect.
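One quick way to see where a fitted line stops producing usable probabilities is to solve for the regressor values at which it crosses 0 and 1. The sketch below does that arithmetic; the coefficients are made-up numbers rather than estimates from any data, `dti` is assumed to be measured in percentage points, and `valid_range` is a hypothetical helper.

```python
def valid_range(intercept, slope):
    """Return (low, high): regressor values where 0 <= intercept + slope*x <= 1."""
    if slope == 0:
        raise ValueError("a zero slope gives a flat line; there is no range to solve for")
    x_at_zero = (0.0 - intercept) / slope   # where the line hits probability 0
    x_at_one = (1.0 - intercept) / slope    # where the line hits probability 1
    return min(x_at_zero, x_at_one), max(x_at_zero, x_at_one)

# Hypothetical fitted line: approval probability falls as dti rises.
b0, b1 = 1.10, -0.02   # made-up numbers, not estimates from real data
low, high = valid_range(b0, b1)
print(f"fitted probabilities stay in [0, 1] for dti between {low:.1f} and {high:.1f}")
```

Comparing this interval with the range of the regressor in the sample shows how many observations would receive an out-of-range prediction.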
Common mistakes
- ✗ Trusting fitted probabilities outside $[0, 1]$. The linear form is not bounded, so out-of-range predictions occur and should not be read as real probabilities.
- ✗ Using default OLS standard errors. The LPM error is heteroskedastic by construction, so robust standard errors are required for valid inference.
- ✗ Treating the constant marginal effect as meaningful everywhere. The LPM imposes the same effect at every value of the regressors, which is exactly where it can mislead as predicted probabilities approach 0 or 1; logit and probit let the effect taper there.
- ✗ Thinking logit or probit are always strictly better. The LPM often delivers very similar average partial effects and is simpler to interpret, which is why it remains widely used.
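The tapering of the logit effect near the boundaries can be illustrated with a short sketch. For a logit with index $a + bx$, the slope of the probability with respect to $x$ is $b \, p(1 - p)$, which shrinks as $p$ approaches 0 or 1. The coefficients and the LPM slope below are hypothetical values chosen so the two effects match at $p = 0.5$.

```python
import math

def logistic(z):
    """Logistic CDF: maps any real index to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit_marginal_effect(a, b, x):
    """Slope of P(y=1|x) = logistic(a + b*x) with respect to x."""
    p = logistic(a + b * x)
    return b * p * (1.0 - p)

a, b = 0.0, 1.0      # hypothetical logit coefficients
lpm_slope = 0.25     # hypothetical LPM slope: the same at every x

for x in (0.0, 2.0, 4.0):
    p = logistic(a + b * x)
    effect = logit_marginal_effect(a, b, x)
    print(f"x={x:.0f}: P={p:.3f}  logit effect={effect:.3f}  LPM effect={lpm_slope:.3f}")
```

Near the middle of the probability range the two effects are close, which is one reason average partial effects from the LPM and logit often agree; the gap opens up only as the probability nears a boundary.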
Revision bullets
- LPM is OLS on a binary $y$; slopes are changes in $P(y = 1 \mid \mathbf{x})$.
- Fitted values can fall outside $[0, 1]$.
- Errors are heteroskedastic with variance $p(\mathbf{x})[1 - p(\mathbf{x})]$, where $p(\mathbf{x}) = P(y = 1 \mid \mathbf{x})$, so use `robust`.
- Coefficients read as changes in probability (percentage points).
- Gateway to limited dependent variable models (logit, probit).
Quick check
In an LPM `regress inlf educ`, what does a coefficient of 0.04 on `educ` imply about an extra year of schooling?
Which is a genuine drawback of the LPM?
Sources
- Wooldridge (2019), §7.5. Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage, 2019. Defines the linear probability model, the probability interpretation of coefficients, and its limitations.
- Wooldridge (2019), §8.5. Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage, 2019. Shows the LPM error is heteroskedastic with variance $P(y = 1 \mid \mathbf{x})[1 - P(y = 1 \mid \mathbf{x})]$ and motivates robust inference.