Econometric Theory/Matrix Differentiation

There are a few simple rules for matrix differentiation. These allow much of econometrics to be done in matrix form, which is simpler and far less cumbersome than working with nested summation signs.

Differentiating an inner product with respect to a vector

Let a be a given column vector and let x be a column choice vector (a vector of values to be chosen) of the same dimension. Their transposes are denoted a' and x'. Then the derivative of their inner product, which is a scalar, with respect to x is a column vector:

∂a'x/∂x = ∂x'a/∂x = a.
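The rule can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper numerical_gradient and the particular values of a and x are illustrative choices, not part of the original text. It approximates each partial derivative of a'x by a central difference and compares the result with a.

```python
import numpy as np

def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation to the gradient of a scalar function f at x."""
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[i] = h  # perturb only the i-th element of x
        grad[i] = (f(x + step) - f(x - step)) / (2 * h)
    return grad

a = np.array([2.0, -1.0, 0.5])   # illustrative "given" vector
x = np.array([1.0, 3.0, -2.0])   # illustrative point at which to differentiate

# Gradient of the inner product a'x with respect to x
grad_inner = numerical_gradient(lambda v: a @ v, x)

# The rule says the gradient equals a, whatever the value of x
matches_rule = np.allclose(grad_inner, a, atol=1e-6)
print(grad_inner, matches_rule)
```

Because a'x is linear in x, the gradient does not depend on the point x at which it is evaluated.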

Differentiating a quadratic form with respect to a vector

Let A be a square matrix, either symmetric or non-symmetric, whose dimension matches that of x, and consider the quadratic form x'Ax, which is itself a scalar. The derivative of this quadratic form with respect to the vector x is the column vector

∂x'Ax/∂x = (A+A')x.

In econometrics, however, the matrix in the quadratic form is almost always symmetric. If A is symmetric, then A' = A and the formula simplifies to

∂x'Ax/∂x = 2Ax.
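Both versions of the quadratic-form rule can be verified in the same way. This is a sketch under stated assumptions: NumPy is available, and the matrix A, the vector x, and the helper numerical_gradient are made up for illustration. It checks the general formula on a non-symmetric A and the simplified formula on the symmetric matrix S = (A + A')/2.

```python
import numpy as np

def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation to the gradient of a scalar function f at x."""
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2 * h)
    return grad

# Illustrative non-symmetric square matrix and evaluation point
A = np.array([[1.0, 2.0,  0.0],
              [0.0, 3.0, -1.0],
              [4.0, 0.0,  2.0]])
x = np.array([1.0, 2.0, 0.5])

# General case: gradient of x'Ax is (A + A')x
grad_numeric = numerical_gradient(lambda v: v @ A @ v, x)
grad_formula = (A + A.T) @ x

# Symmetric case: gradient of x'Sx is 2Sx
S = (A + A.T) / 2
grad_sym_numeric = numerical_gradient(lambda v: v @ S @ v, x)
grad_sym_formula = 2 * S @ x

print(np.allclose(grad_numeric, grad_formula, atol=1e-6))
print(np.allclose(grad_sym_numeric, grad_sym_formula, atol=1e-6))
```

Note that x'Ax and x'Sx are the same scalar for every x, so the two gradients coincide; this is why nothing is lost by assuming that the matrix in a quadratic form is symmetric.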

Application to Ordinary Least Squares

Perhaps the most basic concept in econometrics is ordinary least squares, in which we choose the regression coefficients so as to minimize the sum of squared residuals (mispredictions) of the regression. Suppose the regression model is

y = Xβ + e,

where y is an n×1 vector of observed values of the dependent variable, X is an n×k matrix in which each column is an n×1 vector of observed values of one of the k independent variables (k<n), β is the k×1 vector of estimated response coefficients to be chosen, and e is the n×1 vector of residuals.

The reader may confirm that the sum of squared residuals—the sum of the squares of the elements of e—is given by

e'e = (y - Xβ)'(y - Xβ) = y'y - y'Xβ - β'X'y + β'X'Xβ = y'y - 2β'(X'y) + β'(X'X)β

where the last equality holds because y'Xβ is a scalar and thus is its own transpose. Note that (X'y) is a k×1 vector and (X'X) is a k×k matrix. Using the above rules for differentiation, we have

∂e'e/∂β = -2X'y + 2(X'X)β,
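As a check on this derivative, the sketch below compares the formula with a finite-difference gradient of the sum of squared residuals. It assumes NumPy is available; the simulated X, y, and trial value of β, as well as the function name ssr, are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = rng.normal(size=(n, k))   # simulated n×k regressor matrix
y = rng.normal(size=n)        # simulated dependent variable
beta = rng.normal(size=k)     # arbitrary trial value of the coefficient vector

def ssr(b):
    """Sum of squared residuals e'e at coefficient vector b."""
    e = y - X @ b
    return e @ e

# Analytical gradient from the differentiation rules
grad_formula = -2 * X.T @ y + 2 * (X.T @ X) @ beta

# Central-difference gradient, one coefficient at a time
h = 1e-5
identity = np.eye(k)
grad_numeric = np.array([
    (ssr(beta + h * identity[i]) - ssr(beta - h * identity[i])) / (2 * h)
    for i in range(k)
])

print(np.allclose(grad_numeric, grad_formula, atol=1e-4))
```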

where we have used the fact that X'X is symmetric. When we express the first-order condition by equating this vector derivative to the zero vector, we obtain

(X'X)β = X'y,

which can be solved for the choice vector β as

β = (X'X)^-1X'y.

This is the ordinary least squares estimator. So long as X has full column rank, X'X is positive definite and therefore (X'X)^-1 exists. Moreover, the matrix of second derivatives of e'e with respect to β is 2(X'X), so positive definiteness ensures that the second-order conditions for a minimum are satisfied.
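The estimator can be computed directly from the normal equations. The following is a minimal sketch on simulated data, assuming NumPy is available; the sample size, the value of beta_true, and the noise scale are hypothetical. It solves (X'X)β = X'y with np.linalg.solve rather than forming the inverse explicitly, which is a numerical design choice and not something the derivation requires, and it compares the result with NumPy's own least squares routine.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3

# Simulated regressors: a constant plus two random columns (full column rank)
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([1.0, 2.0, -0.5])       # hypothetical coefficients
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Solve the normal equations (X'X) beta = X'y
XtX = X.T @ X
beta_hat = np.linalg.solve(XtX, X.T @ y)

# Cross-check against NumPy's least squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# First-order condition: the residuals are orthogonal to every column of X
residuals = y - X @ beta_hat
foc = X.T @ residuals

# Second-order condition: X'X is positive definite (all eigenvalues positive)
eigenvalues = np.linalg.eigvalsh(XtX)

print(beta_hat)
print(np.allclose(beta_hat, beta_lstsq))
print(np.allclose(foc, 0.0, atol=1e-8), eigenvalues.min() > 0)
```

When X is close to rank deficient, solving the normal equations can be numerically inaccurate, and routines based on a QR or singular value decomposition of X, such as np.linalg.lstsq, are generally preferred.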