Chapter 11: Further Issues in Time Series

Large Sample Properties and Highly Persistent Data

Why Do We Need Large Sample Properties?

The time series assumptions from Chapter 10 (like strict exogeneity) are often violated in the real world.

A Common Problem: Lagged Dependent Variables

Models like the AR(1) model must violate strict exogeneity:

y_t = β₀ + β₁y_{t-1} + u_t

The error u_t affects y_t, and y_t is the explanatory variable in the equation for period t+1. So u_t is correlated with future values of the regressor (y_t, y_{t+1}, etc.). This violates strict exogeneity (TS.3).

When the assumptions needed for unbiasedness fail, we rely instead on asymptotic properties (consistency and asymptotic normality), which describe how the estimators behave as the sample size grows.
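To make the distinction between unbiasedness and consistency concrete, here is a minimal simulation sketch. It assumes an AR(1) with β₀ = 0, β₁ = 0.9, and standard normal errors; those values, the function names, and the replication count are illustrative choices, not part of the chapter. It uses only the Python standard library.

```python
import random
import statistics

def simulate_ar1(n, beta0, beta1, rng, burn_in=100):
    """Simulate y_t = beta0 + beta1*y_{t-1} + u_t with standard normal u_t."""
    y = beta0 / (1 - beta1)  # start at the stationary mean
    path = []
    for t in range(n + burn_in):
        y = beta0 + beta1 * y + rng.gauss(0.0, 1.0)
        if t >= burn_in:  # discard the burn-in draws
            path.append(y)
    return path

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def mean_ar1_estimate(n, beta1=0.9, reps=300, seed=0):
    """Average OLS estimate of beta1 across many simulated samples of size n."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        y = simulate_ar1(n, 0.0, beta1, rng)
        # Regress y_t on y_{t-1}: the lagged series is the regressor.
        estimates.append(ols_slope(y[:-1], y[1:]))
    return statistics.fmean(estimates)

for n in (25, 100, 1000):
    print(n, round(mean_ar1_estimate(n), 3))
```

In runs of this kind, the average estimate tends to sit noticeably below 0.9 when n is small and moves close to 0.9 as n grows. That is the pattern we expect from an estimator that is biased but consistent.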

Key Condition: Weak Dependence

For our large sample results to hold, a time series must be weakly dependent.

This means that observations in the distant past are not too strongly correlated with observations in the present. As the time between two observations grows, the correlation between them approaches zero sufficiently quickly.

Example: Stable AR(1)

Write the AR(1) slope as ρ (this is β₁ in the model above). When |ρ| < 1, the correlation between y_t and y_{t-h} is ρ^h, which goes to zero as h → ∞.
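The ρ^h result follows from repeated substitution. The sketch below assumes a zero intercept for simplicity, an i.i.d. error e_t with mean zero and variance σ_e², and a covariance stationary process with variance σ_y². These are assumptions added here to keep the derivation short.

```latex
% Assumptions: y_t = \rho y_{t-1} + e_t, |\rho| < 1, e_t i.i.d. with mean 0 and
% variance \sigma_e^2, and y_t covariance stationary with variance \sigma_y^2.
\begin{align*}
y_t &= \rho\, y_{t-1} + e_t \\
    &= \rho^2 y_{t-2} + \rho\, e_{t-1} + e_t \\
    &\;\;\vdots \\
    &= \rho^h y_{t-h} + \sum_{j=0}^{h-1} \rho^j e_{t-j}, \\[4pt]
\operatorname{Cov}(y_t, y_{t-h}) &= \rho^h \operatorname{Var}(y_{t-h}) = \rho^h \sigma_y^2, \\
\operatorname{Corr}(y_t, y_{t-h}) &= \frac{\rho^h \sigma_y^2}{\sigma_y \cdot \sigma_y} = \rho^h .
\end{align*}
```

The covariance step uses the fact that e_t, e_{t-1}, ..., e_{t-h+1} all occur after period t-h and are therefore uncorrelated with y_{t-h}.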

Counterexample: Random Walk

When ρ = 1, the AR(1) process becomes a random walk: shocks never die out, so the series is highly persistent rather than weakly dependent.
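A short derivation shows why the ρ = 1 case behaves so differently. It assumes the process starts at y_0 = 0 and that e_t is i.i.d. with mean zero and variance σ_e²; both are simplifying assumptions added here.

```latex
% Assumptions: y_t = y_{t-1} + e_t, y_0 = 0, e_t i.i.d. with mean 0 and variance \sigma_e^2.
\begin{align*}
y_t &= e_t + e_{t-1} + \dots + e_1, \\
\operatorname{Var}(y_t) &= \sigma_e^2\, t, \\
\operatorname{Cov}(y_t, y_{t+h}) &= \operatorname{Var}(y_t) = \sigma_e^2\, t, \\
\operatorname{Corr}(y_t, y_{t+h}) &= \frac{\sigma_e^2\, t}{\sqrt{\sigma_e^2\, t \cdot \sigma_e^2 (t+h)}} = \sqrt{\frac{t}{t+h}} .
\end{align*}
```

The variance grows without bound as t increases. For any fixed h, the correlation is close to one when t is large, so the correlation does not die out at a rate that is independent of the starting point.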

Asymptotic Properties of OLS

Under a new, weaker set of assumptions (TS.1' to TS.5') that require weak dependence and contemporaneous exogeneity (E(u_t|x_t)=0), we get our key results:

1. Consistency: The OLS estimators converge to the true population values as n gets large.

2. Asymptotic Normality: The OLS estimators are approximately normally distributed in large samples.

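The Big Takeaway: We can still use our usual t-stats, F-stats, and confidence intervals for inference, but now they are only approximately valid, and only in large samples. We don't need strict exogeneity or normality, but we still need homoskedasticity and no serial correlation in the errors (TS.4' and TS.5').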

Highly Persistent Series (Unit Roots)

Many economic time series (such as asset prices, GDP, and exchange rates) appear not to be weakly dependent. They are often better described as random walks:

y_t = y_{t-1} + e_t

A process like this is called integrated of order one, or I(1). Weakly dependent series are I(0).

A stable AR(1) process (blue) tends to revert to its mean. A random walk (red) does not; shocks have permanent effects.
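The contrast between the two processes can be reproduced with a small simulation. The sketch below assumes ρ = 0.5 for the stable case, standard normal shocks, and a start at y_0 = 0; the function names and settings are illustrative. It compares how spread out y_t is across many independent paths at two horizons.

```python
import random
import statistics

def simulate_path(rho, n, rng):
    """Simulate y_t = rho*y_{t-1} + e_t from y_0 = 0 with standard normal e_t."""
    y, path = 0.0, []
    for _ in range(n):
        y = rho * y + rng.gauss(0.0, 1.0)
        path.append(y)
    return path

def variance_across_paths(rho, t, reps=1000, seed=1):
    """Variance of y_t measured across many independent simulated paths."""
    rng = random.Random(seed)
    finals = [simulate_path(rho, t, rng)[-1] for _ in range(reps)]
    return statistics.pvariance(finals)

for t in (50, 200):
    stable = variance_across_paths(0.5, t)  # stable AR(1)
    walk = variance_across_paths(1.0, t)    # random walk
    print(t, round(stable, 2), round(walk, 2))
```

For the stable AR(1), the spread should settle near 1/(1 - ρ²) regardless of the horizon. For the random walk, it should keep growing roughly in proportion to t, which is what "shocks have permanent effects" means in practice.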

The Danger: Spurious Regression

Regressing one I(1) time series on another unrelated I(1) time series will often produce a high R² and a statistically significant t-statistic, even though there is no real relationship between them.

This is called a spurious regression. The variables appear related only because both wander persistently, so they can drift in the same (or opposite) direction for long stretches purely by chance.

These two random walks were generated independently. But if you regressed one on the other, you'd likely find a "significant" relationship!
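A small Monte Carlo experiment illustrates the problem. The sketch below generates pairs of independent random walks, regresses one on the other, and counts how often the usual slope t statistic exceeds 1.96 in absolute value. The sample size, replication count, and function names are illustrative assumptions.

```python
import math
import random
import statistics

def random_walk(n, rng):
    """Simulate y_t = y_{t-1} + e_t from y_0 = 0 with standard normal e_t."""
    y, path = 0.0, []
    for _ in range(n):
        y += rng.gauss(0.0, 1.0)
        path.append(y)
    return path

def slope_t_stat(x, y):
    """Usual OLS t statistic for the slope in a regression of y on x (with intercept)."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ssr = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return slope / se

def rejection_rate(n=100, reps=500, seed=2, critical=1.96):
    """Share of replications in which |t| exceeds the critical value."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = random_walk(n, rng)  # generated independently of y
        y = random_walk(n, rng)
        if abs(slope_t_stat(x, y)) > critical:
            rejections += 1
    return rejections / reps

print(rejection_rate())
```

If the usual inference were valid here, the rejection rate would be close to 0.05, since the two series are unrelated by construction. In experiments like this one, it typically comes out many times larger.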

How to Handle I(1) Variables?

Given the danger of spurious regression, what is the fix when we suspect our variables have a unit root?

Answer: The First Difference

The first difference of an I(1) process is I(0) (weakly dependent). If y_t is a random walk, then:

Δy_t = y_t - y_{t-1} = e_t

Since e_t is weakly dependent, the usual large sample results apply to regressions that use first differences. This is the standard way to avoid spurious regressions.
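The effect of differencing can be checked with the same kind of experiment used to illustrate spurious regression. The sketch below regresses one independent random walk on another, first in levels and then in first differences, and compares how often the slope looks "significant" at the 5% level. Sample size, replication count, and function names are illustrative assumptions.

```python
import math
import random
import statistics

def random_walk(n, rng):
    """Simulate y_t = y_{t-1} + e_t from y_0 = 0 with standard normal e_t."""
    y, path = 0.0, []
    for _ in range(n):
        y += rng.gauss(0.0, 1.0)
        path.append(y)
    return path

def first_difference(series):
    """Return the list of changes y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]

def slope_t_stat(x, y):
    """Usual OLS t statistic for the slope in a regression of y on x (with intercept)."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ssr = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return slope / se

def rejection_rate(difference, n=100, reps=500, seed=3, critical=1.96):
    """Share of replications with |t| above the critical value."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = random_walk(n, rng)  # generated independently of y
        y = random_walk(n, rng)
        if difference:
            x, y = first_difference(x), first_difference(y)
        if abs(slope_t_stat(x, y)) > critical:
            rejections += 1
    return rejections / reps

print("levels:", rejection_rate(difference=False))
print("first differences:", rejection_rate(difference=True))
```

In levels, the rejection rate should be far above the nominal 5%. In first differences, it should fall back to roughly 5%, because the differenced series are just the independent shocks.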

Chapter 11 Summary

Large sample properties are essential for most time series applications, where the strict classical assumptions are unlikely to hold.

For OLS asymptotics to hold, time series must be weakly dependent.
Models with lagged dependent variables violate strict exogeneity, but OLS is still consistent if the model is stable and the regressors are contemporaneously exogenous.
Highly persistent (I(1)) time series, like random walks, are common and violate the weak dependence assumption.
Using I(1) variables in levels can lead to spurious regression.
The standard solution is to use the first difference of I(1) variables, which makes them weakly dependent (I(0)).