Basic Regression Analysis
Unlike cross-sectional data, time series data has a temporal ordering. The past can affect the future, but not the other way around.
We can use time series data to model different kinds of relationships.
Static models capture a contemporaneous relationship, where a change in `z` at time `t` has an immediate effect on `y` at time `t`:
y_t = β_0 + β_1 z_t + u_t
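As a concrete illustration, here is a minimal sketch of estimating a static model by OLS with statsmodels. The data are simulated and the names `y`, `z`, and `df` are placeholders, not part of the original notes.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated data purely for illustration; y_t = 1 + 0.5*z_t + u_t by construction.
rng = np.random.default_rng(0)
n = 100
z = rng.normal(size=n)
y = 1.0 + 0.5 * z + rng.normal(scale=0.3, size=n)
df = pd.DataFrame({"y": y, "z": z})

# Static model: regress y_t on the contemporaneous z_t only.
X = sm.add_constant(df["z"])          # adds the intercept beta_0
static_fit = sm.OLS(df["y"], X).fit()
print(static_fit.params)              # estimates of beta_0 and beta_1
```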
Finite distributed lag (FDL) models allow a change in `z` to affect `y` over several future periods. A model with two lags is:
y_t = α_0 + δ_0 z_t + δ_1 z_{t-1} + δ_2 z_{t-2} + u_t
δ_0, the impact propensity, measures the immediate impact of a one-unit change in `z` on `y`.
The sum of the coefficients (δ_0 + δ_1 + δ_2), called the long-run propensity (LRP), measures the total change in `y` after a permanent one-unit increase in `z`.
The lag distribution plots the δ_j against the lag j and shows how the effect of `z` on `y` is spread over time. If δ_1 is the largest coefficient, for example, the biggest effect occurs one period after the change in `z`.
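A sketch of how an FDL model with two lags could be estimated, again on simulated data: lagged regressors are built with `shift`, and the impact propensity and long-run propensity are read off the fitted coefficients. All names are placeholders.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated data: the effect of z is spread over three periods
# (delta_0 = 0.2, delta_1 = 0.5, delta_2 = 0.1 by construction).
rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({"z": rng.normal(size=n)})
df["z_lag1"] = df["z"].shift(1)
df["z_lag2"] = df["z"].shift(2)
df["y"] = (2.0 + 0.2 * df["z"] + 0.5 * df["z_lag1"] + 0.1 * df["z_lag2"]
           + rng.normal(scale=0.3, size=n))
df = df.dropna()                      # the first two rows have no lags

X = sm.add_constant(df[["z", "z_lag1", "z_lag2"]])
fdl_fit = sm.OLS(df["y"], X).fit()

impact = fdl_fit.params["z"]                            # delta_0: immediate effect
lrp = fdl_fit.params[["z", "z_lag1", "z_lag2"]].sum()   # long-run propensity
print(f"impact propensity: {impact:.2f}, long-run propensity: {lrp:.2f}")
```

In this simulated setup the largest estimated coefficient should be the one on `z_lag1`, matching a lag distribution that peaks one period after the change.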
Because time series data are not a random sample, we replace the cross-sectional Gauss-Markov assumptions with time series counterparts (TS.1 - TS.5).
Strict exogeneity: the error u_t at time `t` must be uncorrelated with the explanatory variables in every time period (past, present, and future). This is a very strong assumption and rules out "feedback" from `y` to future values of the explanatory variables.
No serial correlation: the errors in different time periods must be uncorrelated with each other, Corr(u_t, u_s) = 0 for all t ≠ s. This assumption is often violated in time series models.
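A rough, informal check of this assumption, building on the FDL sketch above: compute the first-order sample autocorrelation of the OLS residuals, which should be near zero if the errors are serially uncorrelated. Formal tests exist; this is only an illustration.

```python
import numpy as np

# Residuals from the FDL sketch above; serially uncorrelated errors imply
# Corr(u_t, u_{t-1}) should be close to zero.
resid = fdl_fit.resid.to_numpy()
r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
print(f"first-order residual autocorrelation: {r1:.2f}")
```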
If TS.1 - TS.5 hold, OLS is BLUE. If we add TS.6 (Normality), our usual t and F tests are exact.
Many economic time series grow over time. If two series are both trending, a regression of one on the other can produce a spurious relationship, even if they are unrelated.
One solution: include a time trend `t` (t = 1, 2, 3, ...) as an additional regressor. This is equivalent to detrending all the variables, so the coefficients measure the effect of `x` on `y` after accounting for the underlying trends.
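A sketch of the idea with two simulated, unrelated trending series: without the trend, regressing `y` on `x` picks up a spurious positive effect; once the trend `t` is included, the coefficient on `x` should be close to zero. All series and names here are synthetic.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Two independent series that both trend upward over time.
rng = np.random.default_rng(2)
n = 120
t = np.arange(1, n + 1)
x = 0.3 * t + rng.normal(scale=2.0, size=n)
y = 0.1 * t + rng.normal(scale=2.0, size=n)

spurious_fit = sm.OLS(y, sm.add_constant(pd.DataFrame({"x": x}))).fit()
trend_fit = sm.OLS(y, sm.add_constant(pd.DataFrame({"x": x, "t": t}))).fit()

print(spurious_fit.params["x"])   # misleadingly large: both series trend upward
print(trend_fit.params["x"])      # close to zero once the time trend is controlled for
```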
For monthly or quarterly data, there are often systematic patterns within a year (e.g., retail sales are high in Q4). We can control for this using seasonal dummy variables.
You have quarterly data on ice cream sales. You want to control for seasonality. How many dummy variables should you include in your regression?
You have 4 seasons (categories). To avoid the dummy variable trap, you should include 3 dummy variables. For example, `Q1`, `Q2`, and `Q3`. The fourth quarter, Q4, would be the base group captured by the intercept.
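Building on the quarterly example above, a minimal sketch of the seasonal-dummy setup on simulated sales data: three dummies (Q1-Q3) enter the regression and Q4 is the base group absorbed by the intercept. The data and names are made up for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# 20 years of simulated quarterly sales; sales are higher in Q4 by construction.
rng = np.random.default_rng(3)
n = 80
quarter = np.tile([1, 2, 3, 4], n // 4)
sales = 100 + 8 * (quarter == 4) + rng.normal(scale=1.0, size=n)
df = pd.DataFrame({"sales": sales, "quarter": quarter})

dummies = pd.get_dummies(df["quarter"], prefix="Q").astype(float)
X = sm.add_constant(dummies[["Q_1", "Q_2", "Q_3"]])   # drop Q_4 to avoid the dummy trap
seasonal_fit = sm.OLS(df["sales"], X).fit()
print(seasonal_fit.params)   # intercept ~ Q4 level; Q_1-Q_3 are shifts relative to Q4 (about -8 here)
```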
Time series data presents new challenges, but OLS is still a powerful tool if we are careful with our assumptions.