Time Series (advanced)

Spurious Regression and Cointegration

Regressing one $I(1)$ series on another, unrelated $I(1)$ series produces a spurious regression: a high $R^2$ and large $t$ statistics that are entirely misleading because the variables share no real link. This is the defining hazard of unit-root data and the reason a high $R^2$ between trending series proves nothing on its own. The important exception is cointegration, which holds when a linear combination of two $I(1)$ series is itself stationary ($I(0)$), so the variables move together in the long run. Cointegrated variables admit an error-correction representation in which short-run changes adjust to close the gap from their long-run equilibrium.

Why it matters

Two random walks can drift in the same direction by chance for a long stretch, and OLS mistakes that coincidence for a strong relationship. Cointegration is the genuine version. Picture two drunks leaving a bar tied by a short rope. Each wanders unpredictably ($I(1)$), yet the distance between them stays bounded ($I(0)$). That bounded gap is the equilibrium relationship, and error correction is the tug on the rope that pulls them back whenever they drift too far apart.

Formulas

Cointegration condition

$$y_t - \beta x_t = u_t \ \text{is} \ I(0), \quad y_t, x_t \sim I(1)$$

Two $I(1)$ series are cointegrated if some linear combination of them is stationary.
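As a rough illustration of this condition, the Python sketch below simulates a random walk $x_t$ and builds $y_t = \beta x_t + u_t$ with a stationary AR(1) gap $u_t$. The function name `simulate_pair`, the values $\beta = 1$ and $\phi = 0.5$, and the seed are all assumptions chosen for the example, not part of the source. The variances of $x$ and $y$ typically differ sharply between the two halves of the sample, while the variance of the gap should stay close to its theoretical value $1/(1-\phi^2) \approx 1.33$.

```python
import random
import statistics


def simulate_pair(n=2000, beta=1.0, phi=0.5, seed=0):
    """Return (x, y): x is a random walk and y = beta * x + u, with u a stationary AR(1)."""
    rng = random.Random(seed)
    x, y = [], []
    level, u = 0.0, 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)       # x_t = x_{t-1} + shock, so x is I(1)
        u = phi * u + rng.gauss(0.0, 1.0)  # |phi| < 1 keeps the gap u stationary, I(0)
        x.append(level)
        y.append(beta * level + u)
    return x, y


x, y = simulate_pair()
gap = [yi - 1.0 * xi for xi, yi in zip(x, y)]  # y_t - beta * x_t using the true beta

half = len(x) // 2
for name, series in (("x", x), ("y", y), ("gap", gap)):
    first = statistics.pvariance(series[:half])
    second = statistics.pvariance(series[half:])
    print(f"{name:>3}: variance in first half {first:9.2f}, second half {second:9.2f}")
```

In practice $\beta$ is unknown and has to be estimated, which is why the residual-based test described later uses its own critical values.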

Error-correction model

$$\Delta y_t = \gamma_0 + \gamma_1 \Delta x_t + \lambda\,(y_{t-1} - \beta x_{t-1}) + e_t$$

A negative $\lambda$ pulls $y$ back toward the long-run relationship; the term in parentheses, $y_{t-1} - \beta x_{t-1}$, is the lagged equilibrium error.
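One way to see where this equation comes from, and why $\lambda$ is negative, is to start from a dynamic regression in levels. The sketch below assumes an autoregressive distributed lag model with one lag of each variable; the coefficient names $\alpha_0, \alpha_1, \delta_0, \delta_1$ are illustrative and do not appear in the source. Subtracting $y_{t-1}$ from both sides and writing $\delta_0 x_t = \delta_0 \Delta x_t + \delta_0 x_{t-1}$ gives:

$$
\begin{aligned}
y_t &= \alpha_0 + \alpha_1 y_{t-1} + \delta_0 x_t + \delta_1 x_{t-1} + e_t \\
\Delta y_t &= \alpha_0 + \delta_0 \Delta x_t - (1-\alpha_1)\, y_{t-1} + (\delta_0 + \delta_1)\, x_{t-1} + e_t \\
&= \alpha_0 + \delta_0 \Delta x_t - (1-\alpha_1)\left( y_{t-1} - \frac{\delta_0 + \delta_1}{1-\alpha_1}\, x_{t-1} \right) + e_t
\end{aligned}
$$

Matching terms with the error-correction model gives $\gamma_0 = \alpha_0$, $\gamma_1 = \delta_0$, $\beta = (\delta_0 + \delta_1)/(1-\alpha_1)$ and $\lambda = -(1-\alpha_1)$. Under the stability condition $|\alpha_1| < 1$, $\lambda$ lies strictly between $-2$ and $0$, which is the negative adjustment described above.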

Worked examples

Scenario

A regression of one $I(1)$ series on another reports a very high $R^2$ and a huge $t$ statistic, and you must judge whether the link is real.

Solution

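Be skeptical. With both series $I(1)$ this is the classic spurious-regression pattern. Test for cointegration by saving the residuals with `predict uhat, resid` and applying a unit-root test to them with `dfuller uhat`. Because the residuals come from an estimated regression, compare the test statistic with Engle-Granger critical values, which are more negative than the standard Dickey-Fuller values that `dfuller` reports. Only if the residuals are stationary is the relationship genuine cointegration rather than spurious.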

Note: Testing the residuals for a unit root is the Engle-Granger two-step approach to detecting cointegration.
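The same two steps can be sketched in plain Python on simulated data. Everything here is an assumption made for illustration: the helper names `ols_simple` and `df_tstat`, the simulated series, the seed, and the simplified Dickey-Fuller regression with no constant and no lagged differences. The threshold $-3.34$ is the approximate asymptotic 5% Engle-Granger critical value for a cointegrating regression with one regressor and a constant, as tabulated in textbooks such as Wooldridge.

```python
import math
import random


def ols_simple(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid


def df_tstat(u):
    """t statistic on rho in the regression du_t = rho * u_{t-1} + e_t."""
    lagged = u[:-1]
    diff = [b - a for a, b in zip(u[:-1], u[1:])]
    suu = sum(v * v for v in lagged)
    rho = sum(v * d for v, d in zip(lagged, diff)) / suu
    sse = sum((d - rho * v) ** 2 for v, d in zip(lagged, diff))
    se = math.sqrt(sse / (len(diff) - 1) / suu)
    return rho / se


def random_walk(n, rng):
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


rng = random.Random(7)
n = 500
x = random_walk(n, rng)
unrelated = random_walk(n, rng)        # independent of x: the spurious case
u, tied = 0.0, []
for xi in x:
    u = 0.5 * u + rng.gauss(0.0, 1.0)  # stationary gap
    tied.append(1.0 + 2.0 * xi + u)    # cointegrated with x by construction

EG_CRIT_5 = -3.34  # approximate asymptotic 5% Engle-Granger critical value (assumption: one regressor, constant)

# Step 1: levels regression and residuals. Step 2: unit-root test on the residuals.
t_unrelated = df_tstat(ols_simple(x, unrelated)[2])
t_tied = df_tstat(ols_simple(x, tied)[2])
for label, t in (("unrelated", t_unrelated), ("tied", t_tied)):
    verdict = "residuals look stationary" if t < EG_CRIT_5 else "cannot reject a unit root"
    print(f"{label:>9}: t = {t:7.2f} -> {verdict}")
```

A fuller test would add lagged differences of the residuals to the second-step regression to absorb serial correlation, as an augmented Dickey-Fuller test does.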

Scenario

Granger and Newbold (1974) ran the founding demonstration of this hazard. They simulated pairs of completely independent random walks and regressed one on the other, knowing the true relationship was zero. What did the regressions report?

Solution

Despite a zero true relationship, the regressions routinely produced a high $R^2$ and large, apparently significant $t$ statistics, purely because both series contained stochastic trends that happened to drift together. Their practical warning sign was simple: be suspicious whenever $R^2$ exceeds the Durbin-Watson statistic. The takeaway is that standard OLS inference is invalid with non-stationary $I(1)$ series, so you should difference the data or test for genuine cointegration before trusting the regression.

Note: This Monte Carlo experiment is the origin of the spurious-regression literature and of the rule of thumb that $R^2 > d$, where $d$ is the Durbin-Watson statistic, flags a likely spurious result.
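A small experiment in the spirit of this Monte Carlo study can be written in a few lines of Python. It is a sketch rather than a replication: the sample size of 100, the 500 replications, the seed, and the helper names `random_walk` and `regress` are assumptions made for the example. Each replication regresses one independent random walk on another and records the slope's $t$ statistic, the $R^2$, and the Durbin-Watson statistic.

```python
import math
import random


def random_walk(n, rng):
    """Cumulative sum of n independent standard normal shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def regress(x, y):
    """OLS of y on a constant and x; returns (slope t statistic, R-squared, Durbin-Watson)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    resid = [(yi - my) - slope * (xi - mx) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)
    se = math.sqrt(sse / (n - 2) / sxx)
    r2 = 1.0 - sse / syy
    dw = sum((b - a) ** 2 for a, b in zip(resid[:-1], resid[1:])) / sse
    return slope / se, r2, dw


rng = random.Random(1974)
reps, n = 500, 100
reject = 0        # replications where the usual 5% t test "finds" a relationship
r2_above_dw = 0   # replications where the rule of thumb R^2 > d fires
for _ in range(reps):
    t, r2, dw = regress(random_walk(n, rng), random_walk(n, rng))
    reject += abs(t) > 1.96
    r2_above_dw += r2 > dw

print(f"|t| > 1.96 in {reject / reps:.0%} of regressions (nominal rate: 5%)")
print(f"R^2 > Durbin-Watson in {r2_above_dw / reps:.0%} of regressions")
```

If the $t$ test were valid, it would reject in about 5% of replications, so a rejection rate far above that level is the spurious-regression problem in miniature.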

Common mistakes

✗ A high $R^2$ between two time series confirms a strong relationship. With $I(1)$ variables this is often spurious and tells you nothing about a true link.
✗ Any two trending variables are cointegrated. Cointegration requires a stationary linear combination, which is special, not automatic.
✗ Differencing cointegrated variables is the right way to model them. Differencing discards the long-run equilibrium; an error-correction model keeps both the short-run and long-run information.
✗ The error-correction term can have a positive coefficient. The adjustment coefficient $\lambda$ must be negative so the system is pulled back toward equilibrium.
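The last two points can be checked on simulated data with a two-step error-correction estimate. This is a minimal sketch under stated assumptions: the data are generated with $\beta = 1.5$ and an AR(1) gap with coefficient $0.6$, which implies $\gamma_1 = 1.5$ and $\lambda = 0.6 - 1 = -0.4$; the helpers `solve` and `ols` are hypothetical names written for this example, not library functions.

```python
import random


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z


def ols(rows, target):
    """OLS coefficients via the normal equations; include a 1.0 column for a constant."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    return solve(xtx, xty)


# Simulated cointegrated pair: y_t = 2 + 1.5 x_t + u_t, with u_t = 0.6 u_{t-1} + e_t.
rng = random.Random(1987)
n = 2000
x, y = [], []
level, u = 0.0, 0.0
for _ in range(n):
    level += rng.gauss(0.0, 1.0)
    u = 0.6 * u + rng.gauss(0.0, 1.0)
    x.append(level)
    y.append(2.0 + 1.5 * level + u)

# Step 1: long-run regression in levels; the residual is the estimated equilibrium error.
alpha_hat, beta_hat = ols([[1.0, xi] for xi in x], y)
uhat = [yi - alpha_hat - beta_hat * xi for xi, yi in zip(x, y)]

# Step 2: regress the change in y on a constant, the change in x, and the lagged error.
rows = [[1.0, x[t] - x[t - 1], uhat[t - 1]] for t in range(1, n)]
dy = [y[t] - y[t - 1] for t in range(1, n)]
gamma0_hat, gamma1_hat, lam_hat = ols(rows, dy)

print(f"beta_hat   = {beta_hat:.3f}  (true 1.5)")
print(f"gamma1_hat = {gamma1_hat:.3f}  (true 1.5)")
print(f"lambda_hat = {lam_hat:.3f}  (true -0.4)")
```

A regression of $\Delta y_t$ on $\Delta x_t$ alone would still recover the short-run coefficient here, but it would drop the lagged equilibrium error and with it any information about how $y$ returns to the long-run relationship.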

Revision bullets

• Regressing independent $I(1)$ series gives a spurious regression
• Spurious results show high $R^2$ and large $t$ but no real link
• Cointegration: a linear combination of $I(1)$ series is $I(0)$
• Cointegrated variables share a long-run equilibrium
• Error correction restores equilibrium with a negative adjustment coefficient

Quick check

Regressing one I(1) series on an independent I(1) series typically produces:

Two I(1) variables are cointegrated when:

Connected topics

R-squared · Trends · TS.1-TS.6 · Serial Corr. · Unit Roots

Sources

Wooldridge (2019), §18.4
Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 7th ed. Cengage, 2019.
Covers spurious regression with I(1) variables, cointegration, and error-correction models.
Engle & Granger (1987)
Engle, R.F., and C.W.J. Granger. Co-integration and Error Correction: Representation, Estimation, and Testing. Econometrica 55 (1987): 251-276.
Foundational paper defining cointegration and the two-step error-correction estimation procedure.
Granger & Newbold (1974)
Granger, C.W.J., and P. Newbold. Spurious Regressions in Econometrics. Journal of Econometrics 2, no. 2 (1974): 111-120.
The original Monte Carlo demonstration that regressing independent random walks yields high R-squared and significant t-statistics, and the source of the rule of thumb that an R-squared above the Durbin-Watson statistic signals a likely spurious result.

How to cite this page

Dr. Phil's Quant Lab. (2026). Spurious Regression and Cointegration. Derivatives Atlas. https://phucnguyenvan.com/concept/efm-spurious-cointegration
