Chapter 10: Time Series Data

Basic Regression Analysis

The Nature of Time Series Data

Unlike cross-sectional data, time series data has a temporal ordering. The past can affect the future, but not the other way around.

  • The observations are not independent draws from the population; they are a single realization of a stochastic process over time.
  • This means observations are often correlated with each other across time periods.
  • We can no longer assume random sampling (MLR.2). This has major consequences for our assumptions.

Types of Time Series Models

We can use time series data to model different kinds of relationships.

1. Static Models

These model a contemporaneous relationship, where a change in `z` at time `t` has an immediate effect on `y` at time `t`.

y_t = β_0 + β_1 z_t + u_t

2. Finite Distributed Lag (FDL) Models

These models allow a change in `z` to affect `y` over several future periods.

y_t = α_0 + δ_0 z_t + δ_1 z_{t-1} + δ_2 z_{t-2} + u_t

Interpreting FDL Models

y_t = α_0 + δ_0 z_t + δ_1 z_{t-1} + δ_2 z_{t-2} + u_t

Impact Propensity

δ_0 measures the immediate impact of a one-unit change in `z` on `y`.

Long-Run Propensity (LRP)

The sum of the coefficients (δ_0 + δ_1 + δ_2) measures the total, long-run change in `y` after a permanent one-unit change in `z`.

The lag distribution traces out the effect of `z` on `y` at each lag, i.e., the δ_j plotted against j. If δ_1 is the largest coefficient, for example, the biggest effect occurs one period after the change in `z`.
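As a concrete sketch, the FDL model above can be estimated by ordinary least squares once the lagged values of `z` are included as regressors. The data below are simulated and every coefficient value is hypothetical; NumPy's least-squares solver stands in for a full regression package:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an FDL process (illustrative, made-up coefficients):
# y_t = 2 + 0.5 z_t + 1.0 z_{t-1} + 0.3 z_{t-2} + u_t
T = 500
z = rng.normal(size=T)
u = rng.normal(scale=0.1, size=T)
y = 2 + 0.5 * z + 1.0 * np.roll(z, 1) + 0.3 * np.roll(z, 2) + u

# Regress y_t on (1, z_t, z_{t-1}, z_{t-2}); drop the first two
# periods, where np.roll wraps around and the lags are invalid
X = np.column_stack([np.ones(T), z, np.roll(z, 1), np.roll(z, 2)])[2:]
beta, *_ = np.linalg.lstsq(X, y[2:], rcond=None)

impact = beta[1]        # delta_0: impact propensity
lrp = beta[1:].sum()    # delta_0 + delta_1 + delta_2: long-run propensity
print(round(impact, 2), round(lrp, 2))
```

The estimated δ_0 recovers the impact propensity (0.5 here by construction), and the sum of the three coefficients recovers the LRP (0.5 + 1.0 + 0.3 = 1.8).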

New Assumptions for Time Series

Because we don't have random sampling, we need to replace our old Gauss-Markov assumptions with new ones (TS.1 - TS.5).

TS.3: Strict Exogeneity

The error u_t at time `t` must be uncorrelated with the explanatory variables in every time period (past, present, and future). This is a very strong assumption and rules out "feedback" from `y` to future `x`'s.

TS.5: No Serial Correlation

The errors in different time periods must be uncorrelated with each other: Corr(u_t, u_s) = 0 for t ≠ s. This is often violated in time series models.
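A quick way to see what serial correlation looks like: simulate AR(1) errors, a common violation of TS.5, and compute the sample correlation between u_t and u_{t-1} (simulated data; the 0.7 is an arbitrary illustrative value):

```python
import numpy as np

rng = np.random.default_rng(1)

# Generate AR(1) errors u_t = 0.7 u_{t-1} + e_t, which violates TS.5
T = 1000
e = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.7 * u[t - 1] + e[t]

# Sample correlation between u_t and u_{t-1}: near 0.7 here, not 0
rho_hat = np.corrcoef(u[1:], u[:-1])[0, 1]
print(round(rho_hat, 2))
```

With real data, the same idea applied to OLS residuals (regressing û_t on û_{t-1}) is the basis of standard tests for serial correlation.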

If TS.1 - TS.5 hold, OLS is BLUE. If we add TS.6 (Normality), our usual t and F tests are exact.

Handling Trends

Many economic time series grow over time. If two series are both trending, a regression of one on the other can produce a spurious relationship, even if they are unrelated.

The Solution: Include a time trend `t` (t = 1, 2, 3, ...) as an additional regressor. Including the trend is equivalent to first detrending every variable, so the coefficients measure the effect of `x` on `y` after the underlying trends have been accounted for.
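A minimal simulation (hypothetical numbers throughout) makes the point: two series that trend upward for unrelated reasons appear strongly related until the time trend is added as a regressor:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two unrelated series that both happen to trend upward
T = 200
t = np.arange(1, T + 1)
x = 0.5 * t + rng.normal(scale=2, size=T)
y = 0.3 * t + rng.normal(scale=2, size=T)

# Regressing y on x alone finds a strong "relationship" (spurious)
b_spurious = np.linalg.lstsq(
    np.column_stack([np.ones(T), x]), y, rcond=None)[0][1]

# Adding the time trend t as a regressor removes it
b_detrended = np.linalg.lstsq(
    np.column_stack([np.ones(T), x, t]), y, rcond=None)[0][1]

print(round(b_spurious, 2), round(b_detrended, 2))
```

Without the trend, the slope on `x` is around 0.6 purely because both series grow over time; once `t` is included, the slope collapses toward zero, its true value here.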

Handling Seasonality

For monthly or quarterly data, there are often systematic patterns within a year (e.g., retail sales are high in Q4). We can control for this using seasonal dummy variables.

Check Your Understanding

You have quarterly data on ice cream sales. You want to control for seasonality. How many dummy variables should you include in your regression?

Answer:

You have 4 seasons (categories). To avoid the dummy variable trap, you should include 3 dummy variables. For example, `Q1`, `Q2`, and `Q3`. The fourth quarter, Q4, would be the base group captured by the intercept.
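A sketch of the ice cream setup with simulated quarterly data (the Q4 spike of 30 is made up): three dummies for Q1–Q3, with Q4 as the base group:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical quarterly "sales" series with a Q4 spike
n_years = 25
quarter = np.tile([1, 2, 3, 4], n_years)
sales = 100 + 30 * (quarter == 4) + rng.normal(scale=5, size=quarter.size)

# Three dummies, Q4 as the base group (avoids the dummy variable trap)
X = np.column_stack([
    np.ones(quarter.size),
    (quarter == 1).astype(float),
    (quarter == 2).astype(float),
    (quarter == 3).astype(float),
])
beta, *_ = np.linalg.lstsq(X, sales, rcond=None)
print(np.round(beta, 1))
```

The intercept estimates mean Q4 sales, and each dummy coefficient estimates the difference between that quarter and Q4, which is negative here because Q4 is the peak quarter.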

Chapter 10 Summary

Time series data presents new challenges, but OLS is still a powerful tool if we are careful with our assumptions.

  • Time series data are temporally ordered, and we can no longer assume random sampling.
  • New assumptions of strict exogeneity and no serial correlation are needed for the usual OLS properties to hold.
  • Finite Distributed Lag (FDL) models allow us to estimate the short-run and long-run effects of a variable.
  • Regressions with trending variables can be spurious. We can avoid this by including a time trend as a regressor.
  • Seasonal dummy variables should be included when using monthly or quarterly data with seasonal patterns.