Econometric Theory/Serial Correlation

In time-series data especially, the classical linear regression (CLR) assumption that $corr(\epsilon _{t},\epsilon _{t-1})=0$ is often violated. In econometrics this is known as Serial Correlation or Autocorrelation. It means that $corr(\epsilon _{t},\epsilon _{t-1})\neq 0$, so there is a pattern across the error terms. The error terms are then not independently distributed across the observations and are not strictly random.

Examples of Autocorrelation

$corr(\epsilon _{t},\epsilon _{t-1})>0$

Positive Autocorrelation

$corr(\epsilon _{t},\epsilon _{t-1})<0$

Negative Autocorrelation

Functional Form

When the error term is related to the previous error term, the relationship can be written as an algebraic equation: $\epsilon _{t}=\rho \epsilon _{t-1}+u_{t}$, where ρ is the autocorrelation coefficient between the two disturbance terms and $u_{t}$ is the disturbance term of the autocorrelation equation itself. This is known as an Autoregressive Process, with $-1<\rho =corr(\epsilon _{t},\epsilon _{t-1})<1$. The $u_{t}$ term is needed because, although the error term is partly predictable from its previous value, it still has a random component.

Serial Correlation of the Nth Order

Autoregressive model

First order Autoregressive Process, AR(1): $\epsilon _{t}=\rho \epsilon _{t-1}+u_{t}$
- This is known as the first order autoregression, due to the error term only depending on the previous error term.
nth order Autoregressive Process, AR(n): $\epsilon _{t}=\rho _{1}\epsilon _{t-1}+\rho _{2}\epsilon _{t-2}+\cdots +\rho _{n}\epsilon _{t-n}+u_{t}$
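
To make the AR(1) process concrete, the following minimal sketch simulates error terms from $\epsilon _{t}=\rho \epsilon _{t-1}+u_{t}$ and measures the sample correlation between neighbouring errors. The function names, the starting value of zero, the standard normal $u_{t}$ and the choice ρ = ±0.7 are illustrative assumptions, not part of the text above.

```python
import random

def simulate_ar1(rho, n, seed=0):
    """Generate n error terms following eps_t = rho * eps_{t-1} + u_t."""
    rng = random.Random(seed)
    eps = []
    prev = 0.0  # assumed starting value for the process
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)  # assumed standard normal disturbance u_t
        prev = rho * prev + u
        eps.append(prev)
    return eps

def lag1_corr(x):
    """Sample correlation between x_t and x_{t-1}."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

# Positive and negative autocorrelation with illustrative values of rho.
r_pos = lag1_corr(simulate_ar1(rho=0.7, n=5000))
r_neg = lag1_corr(simulate_ar1(rho=-0.7, n=5000))
print("rho = 0.7  -> sample lag-1 correlation:", round(r_pos, 2))
print("rho = -0.7 -> sample lag-1 correlation:", round(r_neg, 2))
```

With a long enough series, the sample correlation should land close to the ρ used to generate the errors, positive in the first case and negative in the second.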

Moving-average model

The notation MA(q) refers to the moving average model of order q:

$X_{t}=\mu +\varepsilon _{t}+\sum _{i=1}^{q}\theta _{i}\varepsilon _{t-i}$

where $\theta _{1},\ldots ,\theta _{q}$ are the parameters of the model, μ is the expectation of $X_{t}$ (often assumed to equal 0), and $\varepsilon _{t}$, $\varepsilon _{t-1}$, ... are white noise error terms. The moving-average model is essentially a finite impulse response filter with some additional interpretation placed on it.
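
One way to see the "finite" nature of a moving-average model is to simulate it: in an MA(1) series the correlation between $X_{t}$ and $X_{t-1}$ is $\theta _{1}/(1+\theta _{1}^{2})$, while the correlation at lag 2 and beyond is zero. The sketch below checks this numerically. The helper names, the standard normal shocks and the value $\theta _{1}=0.5$ are assumptions made for illustration.

```python
import random

def simulate_ma(mu, theta, n, seed=0):
    """Generate X_t = mu + e_t + sum_i theta[i] * e_{t-1-i}, an MA(q) series."""
    rng = random.Random(seed)
    q = len(theta)
    # Draw q extra shocks so the first observation already has q lagged shocks.
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n + q)]
    series = []
    for t in range(q, n + q):
        lagged = sum(theta[i] * shocks[t - 1 - i] for i in range(q))
        series.append(mu + shocks[t] + lagged)
    return series

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

theta1 = 0.5  # illustrative MA(1) parameter
x = simulate_ma(mu=0.0, theta=[theta1], n=20000)
r1 = autocorr(x, 1)
r2 = autocorr(x, 2)
expected_r1 = theta1 / (1 + theta1 ** 2)  # theoretical lag-1 autocorrelation
print("lag 1:", round(r1, 3), "theory:", expected_r1)
print("lag 2:", round(r2, 3), "theory: 0")
```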

Autoregressive–moving-average model

The notation ARMA(p, q) refers to the model with p autoregressive terms and q moving-average terms. This model contains the AR(p) and MA(q) models as special cases:

$X_{t}=c+\varepsilon _{t}+\sum _{i=1}^{p}\varphi _{i}X_{t-i}+\sum _{i=1}^{q}\theta _{i}\varepsilon _{t-i}$
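
The ARMA recursion can be written directly as a loop. The sketch below is one possible implementation, assuming standard normal white noise and treating all values before the start of the sample as zero. The function name `simulate_arma` and the parameter values are hypothetical. Passing an empty list for either set of coefficients recovers the pure AR(p) or MA(q) model.

```python
import random

def simulate_arma(c, phi, theta, n, seed=0):
    """Generate X_t = c + e_t + sum_i phi[i]*X_{t-1-i} + sum_i theta[i]*e_{t-1-i}.

    Values before the start of the sample are treated as zero (an assumption).
    """
    rng = random.Random(seed)
    p, q = len(phi), len(theta)
    x, e = [], []
    for t in range(n):
        e_t = rng.gauss(0.0, 1.0)  # white noise shock
        ar_part = sum(phi[i] * x[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma_part = sum(theta[i] * e[t - 1 - i] for i in range(q) if t - 1 - i >= 0)
        x.append(c + e_t + ar_part + ma_part)
        e.append(e_t)
    return x

# The same seed gives the same shocks, so the special cases can be compared.
noise = simulate_arma(0.0, [], [], 5, seed=3)        # white noise only
ar1 = simulate_arma(0.0, [0.5], [], 5, seed=3)       # AR(1)
ma1 = simulate_arma(0.0, [], [0.4], 5, seed=3)       # MA(1)
arma11 = simulate_arma(0.0, [0.5], [0.4], 5, seed=3) # ARMA(1, 1)
print(arma11)
```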

Causes of Autocorrelation

Spatial Autocorrelation

$corr(\epsilon _{t},\epsilon _{t-1})\neq 0$ Spatial Autocorrelation occurs when the two errors are spatially and/or geographically related. In simpler terms, they are "next to each other." Example: The city of St. Paul has a spike in crime and so hires additional police. The following year, its crime rate decreases significantly. Meanwhile, the city of Minneapolis, which had not adjusted its police force, finds that its crime rate has increased over the same period.

Note: this type of Autocorrelation occurs over cross-sectional samples.

Inertia/Time to Adjust
1. This often occurs in macroeconomic time-series data. Suppose the US interest rate increases unexpectedly, causing an associated change in exchange rates with other countries. Reaching a new equilibrium could take some time.
Prolonged Influences
1. This is again a macroeconomic time-series issue, dealing with economic shocks. Suppose it is now expected that the US interest rate will increase. The associated exchange rates will slowly adjust up until the announcement by the Federal Reserve and may overshoot the equilibrium.
Data Smoothing/Manipulation
1. Using functions to smooth data will introduce autocorrelation into the disturbance terms.
Misspecification
1. A regression will often show signs of autocorrelation when there are omitted variables. Because the missing independent variable is absorbed into the disturbance term, the estimated model has a disturbance term that looks like $\epsilon _{t}=\beta _{2}X_{2}+u_{t}$, when the correct specification is $Y_{t}=\beta _{0}+\beta _{1}X_{1}+\beta _{2}X_{2}+u_{t}$. If $X_{2}$ is itself correlated over time, the disturbance term inherits that pattern.
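
The misspecification case can be illustrated with a small simulation. In this sketch the data are generated from $Y_{t}=\beta _{0}+\beta _{1}X_{1}+\beta _{2}X_{2}+u_{t}$ with an $X_{2}$ that is persistent over time, and the regression is then run on $X_{1}$ alone. All parameter values, the AR(1) form chosen for $X_{2}$ and the helper names are assumptions made for illustration.

```python
import random

def ar1_series(rho, n, rng):
    """Persistent series: value_t = rho * value_{t-1} + noise_t."""
    value, out = 0.0, []
    for _ in range(n):
        value = rho * value + rng.gauss(0.0, 1.0)
        out.append(value)
    return out

def lag1_corr(x):
    """Sample correlation between x_t and x_{t-1}."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def simple_ols_residuals(x, y):
    """Residuals from regressing y on a constant and a single regressor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    b0 = my - b1 * mx
    return [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

rng = random.Random(1)
n = 2000
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]  # included regressor
x2 = ar1_series(0.9, n, rng)                  # persistent regressor that gets omitted
u = [rng.gauss(0.0, 1.0) for _ in range(n)]   # true disturbance, not autocorrelated
y = [1.0 + 0.5 * a + 2.0 * b + e for a, b, e in zip(x1, x2, u)]

# Misspecified regression: X2 is left out, so it ends up in the residuals.
r_missing = lag1_corr(simple_ols_residuals(x1, y))
r_u = lag1_corr(u)
print("lag-1 correlation of residuals, X2 omitted:", round(r_missing, 2))
print("lag-1 correlation of the true disturbance u:", round(r_u, 2))
```

The true disturbance shows essentially no autocorrelation, while the residuals of the misspecified regression are strongly autocorrelated because they carry the omitted $\beta _{2}X_{2}$ term.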

Consequences of Autocorrelation

The main problem with autocorrelation is that it may make a model look better than it actually is.

List of consequences

Coefficient estimates are still unbiased, because $E(\epsilon _{t})=0$ and $cov(X_{t},u_{t})=0$ still hold.
The true variance of ${\hat {\beta }}$ is increased by the presence of autocorrelation.
The estimated variance of ${\hat {\beta }}$ is smaller due to autocorrelation (biased downward).
This produces a decrease in $se({\hat {\beta }})$ and an increase in the t-statistics, so the estimator looks more accurate than it actually is.
R² becomes inflated.

All of these problems result in hypothesis tests becoming invalid.
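
These consequences can be checked with a small Monte Carlo experiment. The sketch below repeatedly generates data in which both the regressor and the error term follow AR(1) processes with ρ = 0.8, runs OLS, and compares the standard error that OLS reports with the actual spread of the slope estimates across replications. The sample size, number of replications, parameter values and helper names are assumptions made for illustration. The downward bias shown here relies on the regressor being positively autocorrelated as well.

```python
import random
import statistics

def ar1_series(rho, n, rng, burn_in=50):
    """AR(1) series; the first burn_in values are discarded."""
    value, out = 0.0, []
    for t in range(n + burn_in):
        value = rho * value + rng.gauss(0.0, 1.0)
        if t >= burn_in:
            out.append(value)
    return out

def ols_slope_and_se(x, y):
    """Slope estimate and its conventional OLS standard error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5  # formula that assumes no autocorrelation
    return b1, se

rng = random.Random(42)
slopes, reported_se = [], []
for _ in range(500):                  # 500 replications (illustrative)
    x = ar1_series(0.8, 100, rng)     # autocorrelated regressor
    eps = ar1_series(0.8, 100, rng)   # autocorrelated error term
    y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, eps)]  # true slope is 2
    b1, se = ols_slope_and_se(x, y)
    slopes.append(b1)
    reported_se.append(se)

mean_slope = statistics.mean(slopes)
actual_sd = statistics.stdev(slopes)
average_reported_se = statistics.mean(reported_se)
print("mean slope estimate:", round(mean_slope, 3))
print("actual spread of slope estimates:", round(actual_sd, 3))
print("average standard error reported by OLS:", round(average_reported_se, 3))
```

The average slope estimate should sit close to the true value of 2, consistent with unbiasedness, while the reported standard error should be clearly smaller than the actual spread of the estimates, which is what inflates the t-statistics.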

Figure: Autocorrelation in data. Two runs are shown, but the true regression line, which we would never have found, lies somewhere in the middle.

Testing for Autocorrelation

While not conclusive, an impression can be gained by viewing a graph of the dependent variable against the error term (namely, a residual scatter-plot).
Durbin-Watson test:
1. Assume $\epsilon _{t}=\epsilon _{t-1}\rho +u_{t}$
2. Test H(0): ρ = 0 (no AC) against H(1): ρ > 0 (one-tailed test)
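3. Test statistic $DW={\frac {\sum (\epsilon _{t}-\epsilon _{t-1})^{2}}{\sum \epsilon _{t}^{2}}}\approx 2-2\rho$, computed from the regression residuals. DW is therefore close to 2 when there is no autocorrelation and close to 0 under strong positive autocorrelation.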
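
The link between DW and ρ follows from expanding the square in the numerator. As a sketch, write ${\hat {\rho }}$ for the sample estimate $\sum \epsilon _{t}\epsilon _{t-1}/\sum \epsilon _{t}^{2}$ (a notation introduced here for illustration) and assume the sample is large enough that $\sum \epsilon _{t}^{2}$ and $\sum \epsilon _{t-1}^{2}$ are nearly equal:

$\sum (\epsilon _{t}-\epsilon _{t-1})^{2}=\sum \epsilon _{t}^{2}+\sum \epsilon _{t-1}^{2}-2\sum \epsilon _{t}\epsilon _{t-1}$

$DW\approx {\frac {2\sum \epsilon _{t}^{2}-2\sum \epsilon _{t}\epsilon _{t-1}}{\sum \epsilon _{t}^{2}}}=2(1-{\hat {\rho }})$

So ${\hat {\rho }}=0$ gives a DW near 2, ${\hat {\rho }}$ near 1 gives a DW near 0, and ${\hat {\rho }}$ near -1 gives a DW near 4.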

Any value under D(L), the lower bound in the D-W table, rejects the null hypothesis: AC exists.
Any value between D(L) and D(U), the upper bound in the D-W table, leaves us with no conclusion about AC.
Any value larger than D(U) fails to reject the null hypothesis: there is no evidence of AC.

Note: this is a one-tailed test for positive autocorrelation. To test the other tail (negative autocorrelation, ρ < 0), use 4 - DW as the test statistic instead.
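
The test procedure above can be sketched in a few lines of code. The function names below are hypothetical, and the bounds passed to `dw_decision` are placeholders rather than values from a Durbin-Watson table. In practice D(L) and D(U) must be looked up for the actual sample size and number of regressors. For simplicity the statistic is applied to simulated AR(1) errors with ρ = 0.7 instead of residuals from an estimated regression.

```python
import random

def durbin_watson(residuals):
    """DW = sum of squared first differences / sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

def lag1_coefficient(residuals):
    """Sample estimate of rho: sum(e_t * e_{t-1}) / sum(e_t ** 2)."""
    num = sum(residuals[t] * residuals[t - 1] for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

def dw_decision(dw, d_lower, d_upper):
    """One-tailed test for positive autocorrelation; bounds come from a D-W table."""
    if dw < d_lower:
        return "reject H0: positive autocorrelation"
    if dw > d_upper:
        return "do not reject H0"
    return "inconclusive"

# Simulated AR(1) errors standing in for regression residuals (an assumption).
rng = random.Random(7)
eps, prev = [], 0.0
for _ in range(5000):
    prev = 0.7 * prev + rng.gauss(0.0, 1.0)
    eps.append(prev)

dw = durbin_watson(eps)
rho_hat = lag1_coefficient(eps)
print("DW:", round(dw, 3), " 2 - 2*rho_hat:", round(2 - 2 * rho_hat, 3))

# Placeholder bounds for illustration only, NOT taken from a Durbin-Watson table.
print(dw_decision(dw, d_lower=1.0, d_upper=1.5))
# For the other tail (negative autocorrelation), test 4 - DW against the same bounds.
print(dw_decision(4 - dw, d_lower=1.0, d_upper=1.5))
```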