Probability and Statistics Review
OLS rests on a few ideas from probability. The expected value is the population mean, the variance measures spread, and the covariance measures how two variables move together. We never see the population, so we estimate its parameters from a sample, and a different sample would give different estimates. This sampling variation is exactly what standard errors quantify, and the Central Limit Theorem explains why estimators become approximately normal in large samples.
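As a rough illustration of these quantities, the sketch below computes a sample mean, variance, covariance, and the standard error of the mean using only Python's standard library. The wage and schooling numbers are made up for the example, and the variable names are assumptions, not part of the course material.

```python
import math
import statistics

# Hypothetical sample: hourly wages and years of schooling for six workers.
wages = [12.0, 15.5, 9.0, 22.0, 18.5, 13.0]
schooling = [10, 12, 9, 16, 14, 11]

n = len(wages)
mean_wage = statistics.mean(wages)      # sample estimate of E[wage]
var_wage = statistics.variance(wages)   # sample variance (divides by n - 1)
mean_school = statistics.mean(schooling)

# Sample covariance: sum of products of deviations, divided by n - 1.
cov_ws = sum((w - mean_wage) * (s - mean_school)
             for w, s in zip(wages, schooling)) / (n - 1)

# Standard error of the sample mean: estimated spread of the mean across samples.
se_mean = math.sqrt(var_wage / n)

print(f"mean = {mean_wage:.2f}, variance = {var_wage:.2f}")
print(f"covariance = {cov_ws:.2f}, SE(mean) = {se_mean:.2f}")
```

A positive covariance here only says that wages and schooling tend to move together in this invented sample; a different sample would give different numbers.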
Why it matters
There is a true mean out in the world (the population) and an estimate we compute from our data (the sample). The two are not the same, and the gap tends to shrink as the sample grows. The Central Limit Theorem is the quiet hero of the course. It says averages of many independent draws pile up into a bell curve even when the raw data do not, which is why t-tests and confidence intervals work in large samples without assuming the data are normal.
Formulas
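The standard definitions behind the ideas in this section can be written as follows. The notation is conventional and is an assumption here ($\mu_X$ and $\sigma_X^2$ for the population mean and variance, $\bar{X}$ and $s$ for their sample counterparts, $n$ for the sample size); the variance of $\bar{X}$ and the limit result assume a random sample with finite variance.

```latex
\begin{align*}
\mathrm{E}[X] &= \mu_X, \\
\mathrm{Var}(X) &= \mathrm{E}\big[(X - \mu_X)^2\big] = \sigma_X^2, \\
\mathrm{Cov}(X, Y) &= \mathrm{E}\big[(X - \mu_X)(Y - \mu_Y)\big], \\
\bar{X} &= \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
  \mathrm{Var}(\bar{X}) = \frac{\sigma_X^2}{n}, \\
\mathrm{se}(\bar{X}) &= \frac{s}{\sqrt{n}}, \qquad
  s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\big(X_i - \bar{X}\big)^2, \\
\frac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}} &\xrightarrow{\;d\;} \mathcal{N}(0, 1)
  \quad \text{as } n \to \infty.
\end{align*}
```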
Worked examples
You draw 50 workers and compute a mean wage. A classmate draws a different 50 and gets a different mean. Which value is "right"?
Neither is the true population mean; both are estimates that vary by sample. The standard error reports the typical size of that variation. By the Central Limit Theorem, across many such samples the means would cluster in an approximately normal bell around the true mean, which is what makes interval estimates meaningful.
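The answer above can be checked with a small simulation. This is a sketch under stated assumptions: the "population" is an invented, strongly skewed set of wages (exponential with mean 20), and the number of repeated samples is arbitrary. It draws many samples of 50, then compares the spread of the sample means with the theoretical standard error.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical skewed "population" of wages: exponential with mean 20.
population = [random.expovariate(1 / 20) for _ in range(100_000)]
true_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

n = 50
# Draw many samples of 50 workers and record each sample mean.
sample_means = [statistics.fmean(random.sample(population, n))
                for _ in range(2_000)]

# Spread of the means across samples vs. the theoretical standard error.
sd_of_means = statistics.stdev(sample_means)
theoretical_se = pop_sd / n ** 0.5

# Share of sample means within 1.96 standard errors of the true mean;
# approximate normality suggests this should be near 0.95.
within = sum(abs(m - true_mean) <= 1.96 * theoretical_se
             for m in sample_means) / len(sample_means)

print(f"true mean              = {true_mean:.2f}")
print(f"sd of sample means     = {sd_of_means:.2f}")
print(f"sigma / sqrt(n)        = {theoretical_se:.2f}")
print(f"share within 1.96 SE   = {within:.3f}")
```

Each entry of `sample_means` plays the role of one classmate's estimate: none equals the true mean, but their spread is close to the standard error even though individual wages are far from normal.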
Common mistakes
- ✗ The sample mean equals the population mean. The sample mean estimates the population mean but almost always differs from it; the difference is sampling error.
- ✗ Zero covariance means two variables are unrelated. Zero covariance rules out a linear association but not a nonlinear one, since variables can be dependent yet have zero covariance.
- ✗ The Central Limit Theorem requires the data themselves to be normal. The theorem delivers approximate normality of the average even when the underlying variable is far from normal, provided the sample is large.
- ✗ A larger sample makes each observation more precise. A larger sample sharpens estimates of population parameters, not the individual data points, by shrinking the variance of the estimator.
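The zero-covariance mistake is easy to see in a tiny constructed example. In the sketch below, the data are hypothetical: X takes the values -1, 0, and 1 equally often and Y is defined as X squared, so Y is completely determined by X and yet the two have zero covariance.

```python
import statistics

# Hypothetical data: X is symmetric around zero, Y is an exact function of X.
x = [-1, 0, 1] * 100
y = [xi ** 2 for xi in x]

mean_x = statistics.fmean(x)
mean_y = statistics.fmean(y)

# Sample covariance with the n - 1 divisor.
cov_xy = sum((xi - mean_x) * (yi - mean_y)
             for xi, yi in zip(x, y)) / (len(x) - 1)

# Dependence shows up in the conditional means of Y given X.
y_when_x_zero = [yi for xi, yi in zip(x, y) if xi == 0]
y_when_x_nonzero = [yi for xi, yi in zip(x, y) if xi != 0]

print(f"cov(X, Y)        = {cov_xy:.4f}")
print(f"mean of Y | X=0  = {statistics.fmean(y_when_x_zero):.1f}")
print(f"mean of Y | X!=0 = {statistics.fmean(y_when_x_nonzero):.1f}")
```

The positive and negative deviations of X cancel in the covariance sum, so the linear measure sees nothing, while the conditional means make the dependence obvious.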
Revision bullets
- Expected value is the population mean; variance measures spread
- Covariance measures linear co-movement of two variables
- Sample statistics estimate population parameters with error
- Sampling variation is what standard errors quantify
- CLT: sample averages are approximately normal in large samples
Quick check
Two different random samples from the same population usually give different means because of ____.
The Central Limit Theorem implies that, in large samples, the distribution of the sample mean is approximately ____.
Sources
- Wooldridge (2019), App. B-C. Wooldridge, J. M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage, 2019. ISBN 978-1-337-55886-0. Appendices B and C review probability, expectation, and the Central Limit Theorem assumed throughout Chapters 2 to 5.