Econometric Theory/Ordinary Least Squares (OLS)

Ordinary Least Squares (OLS) is one of the simplest methods of linear regression. The goal of OLS is to "fit" a function to the data as closely as possible. It does so by minimizing the sum of squared errors between the function and the data.

Why we Square Errors before Summing

We are not trying to minimize the sum of errors, but rather the sum of squared errors. Let's take a brief look at our sweater story again.

Model A

Model B

model	data point	error from line
A	1	5
A	2	10
A	3	-5
A	4	-10
B	1	3
B	2	-3
B	3	3
B	4	-3

Notice that the errors of Model A sum to $5+10-5-10=0$ and the errors of Model B sum to $3-3+3-3=0$.

The errors of both models sum to 0. Does this mean that both are great fits? No! The positive and negative errors simply cancel each other out.

So, to account for the signs, we square each error before summing. Both positive and negative deviations are then penalized equally when we minimize the errors of the fitted line.
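To make the comparison concrete, the following minimal sketch computes both the plain sum and the sum of squares for the errors in the table above. The function names `sum_of_errors` and `sum_of_squared_errors` are illustrative choices, not part of the original text.

```python
# Errors from the table above, keyed by model name.
errors = {
    "A": [5, 10, -5, -10],
    "B": [3, -3, 3, -3],
}


def sum_of_errors(errs):
    """Plain sum: positive and negative errors cancel."""
    return sum(errs)


def sum_of_squared_errors(errs):
    """Sum of squares: every deviation counts, whatever its sign."""
    return sum(e ** 2 for e in errs)


for model, errs in errors.items():
    print(model, sum_of_errors(errs), sum_of_squared_errors(errs))
# prints:
# A 0 250
# B 0 36
```

Both plain sums are 0, but the sums of squares differ (250 against 36), so by this criterion Model B fits the data more closely.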

The Model

These two models each have an intercept term $\alpha$ and a slope term $\beta$ (some textbooks use $\beta _{0}$ instead of $\alpha$ and $\beta _{1}$ instead of $\beta$, which is a much better approach once we move to multivariate formulas). We can represent an arbitrary single-variable model with the formula:
$y_{i}=\alpha +\beta x_{i}+u_{i}$
The y-values are related to the x-values by this formula. $y$ is called the dependent variable and $x$ is called the independent variable, since the value of $y_{i}$ depends on the value of $x_{i}$. We use the subscript $i$ to denote an observation, so $y_{1}$ is paired with $x_{1}$, $y_{2}$ with $x_{2}$, etc. The $u_{i}$ term is the error term, which is the difference between the observed value of $y_{i}$ and the part explained by $x_{i}$, namely $\alpha +\beta x_{i}$.

Unfortunately, we don't know the values of $\alpha$, $\beta$ or $u_{i}$. We have to estimate them. We can do this with the ordinary least squares method. The term "least squares" means that we are trying to minimize the sum of the squared error terms, which we call $f$. Since there are two parameters to minimize over ($\alpha$ and $\beta$), we set the partial derivative of $f$ with respect to each of them to zero, which gives two equations:
$f=\Sigma u_{i}^{2}=\Sigma (y_{i}-\alpha -\beta x_{i})^{2}$
${\frac {\partial f}{\partial \alpha }}=-2\Sigma (y_{i}-\alpha -\beta x_{i})=0$
${\frac {\partial f}{\partial \beta }}=-2\Sigma (y_{i}-\alpha -\beta x_{i})x_{i}=0$
Call the solutions to these equations ${\hat {\alpha }}$ and ${\hat {\beta }}$. Solving, we get:
${\hat {\alpha }}={\bar {y}}-{\hat {\beta }}{\bar {x}}$
${\hat {\beta }}={\frac {\Sigma (x_{i}-{\bar {x}})y_{i}}{\Sigma (x_{i}-{\bar {x}})^{2}}}$
where ${\bar {y}}={\frac {\Sigma y_{i}}{n}}$ and ${\bar {x}}={\frac {\Sigma x_{i}}{n}}$. Deriving these results is left as an exercise.
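The two formulas above translate almost directly into code. The sketch below is one possible implementation in plain Python. The function name `ols_fit` and the small data set are made up for illustration, with the data chosen to lie exactly on the line $y=1+2x$ so the result is easy to check by hand.

```python
def ols_fit(x, y):
    """Return (alpha_hat, beta_hat) for the model y_i = alpha + beta * x_i + u_i."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # beta_hat = sum((x_i - x_bar) * y_i) / sum((x_i - x_bar)^2)
    numerator = sum((xi - x_bar) * yi for xi, yi in zip(x, y))
    denominator = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = numerator / denominator
    # alpha_hat = y_bar - beta_hat * x_bar
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Hypothetical data lying exactly on y = 1 + 2x (no error term).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]

alpha_hat, beta_hat = ols_fit(x, y)
print(alpha_hat, beta_hat)  # 1.0 2.0
```

Note that the denominator is zero when all $x_{i}$ are equal, so the slope cannot be estimated without some variation in $x$. This sketch does not guard against that case.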

It is important to know that ${\hat {\alpha }}$ and ${\hat {\beta }}$ are not the same as $\alpha$ and $\beta$, because they are based on a single sample rather than the entire population. If you took a different sample, you would get different values for ${\hat {\alpha }}$ and ${\hat {\beta }}$. We call ${\hat {\alpha }}$ and ${\hat {\beta }}$ the OLS estimators of $\alpha$ and $\beta$. One of the main goals of econometrics is to analyze the quality of these estimators and to see under which conditions they are good estimators and under which they are not.
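A small simulation can illustrate how the estimates change from sample to sample. The sketch below is built entirely on assumptions that are not in the text: a "population" with $\alpha =1$ and $\beta =2$, $x$ drawn uniformly from 0 to 10, normally distributed errors with standard deviation 1, and samples of 50 observations. The helper names `ols_fit` and `draw_sample` are hypothetical.

```python
import random


def ols_fit(x, y):
    """Return (alpha_hat, beta_hat) using the closed-form OLS formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    beta_hat = (sum((xi - x_bar) * yi for xi, yi in zip(x, y))
                / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - beta_hat * x_bar, beta_hat


def draw_sample(rng, n, alpha, beta, sigma):
    """Draw n observations from y_i = alpha + beta * x_i + u_i (assumed setup)."""
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    y = [alpha + beta * xi + rng.gauss(0.0, sigma) for xi in x]
    return x, y


# Assumed "true" population parameters, chosen only for illustration.
TRUE_ALPHA, TRUE_BETA = 1.0, 2.0

rng = random.Random(0)  # fixed seed so the run is reproducible
estimates = [
    ols_fit(*draw_sample(rng, 50, TRUE_ALPHA, TRUE_BETA, 1.0))
    for _ in range(3)
]

for alpha_hat, beta_hat in estimates:
    print(f"alpha_hat={alpha_hat:.3f}  beta_hat={beta_hat:.3f}")
```

Each of the three samples yields a slightly different pair of estimates. Under these assumptions they should be close to, but not equal to, the true values of 1 and 2.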

Once we have ${\hat {\alpha }}$ and ${\hat {\beta }}$, we can construct two more variables. The first is the fitted values, or estimates of $y$:
${\hat {y}}_{i}={\hat {\alpha }}+{\hat {\beta }}x_{i}$
The second is the estimates of the error terms, which we will call the residuals:
${\hat {u}}_{i}=y_{i}-{\hat {y}}_{i}$
These two variables will be important later on.
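As a rough sketch, the fitted values and residuals can be computed as follows. The data are made up for illustration and `ols_fit` is a hypothetical helper that applies the formulas for ${\hat {\alpha }}$ and ${\hat {\beta }}$ given earlier. The last two lines check a consequence of the two first-order conditions: the residuals sum to zero, and so do the residuals multiplied by $x_{i}$, up to floating-point rounding.

```python
def ols_fit(x, y):
    """Return (alpha_hat, beta_hat) using the closed-form OLS formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    beta_hat = (sum((xi - x_bar) * yi for xi, yi in zip(x, y))
                / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - beta_hat * x_bar, beta_hat


# Hypothetical data, roughly following y = 2x with some noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

alpha_hat, beta_hat = ols_fit(x, y)

# Fitted values: y_hat_i = alpha_hat + beta_hat * x_i
fitted = [alpha_hat + beta_hat * xi for xi in x]

# Residuals: u_hat_i = y_i - y_hat_i
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Both sums should be zero apart from rounding error.
print(sum(residuals))
print(sum(r * xi for r, xi in zip(residuals, x)))
```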