Linear Algebra/Eigenvalues and eigenvectors

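An eigenvector of a square matrix is a nonzero vector that the matrix maps to a scalar multiple of itself, and the corresponding eigenvalue is that scalar. Together they capture fundamental properties of the matrix.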

The word eigenvalue comes from the German Eigenwert which means "proper or characteristic value."

Motivations

Large matrices can be costly to work with in terms of computational time, and a single calculation may require the same matrix to be applied hundreds or thousands of times. The behavior of matrices would also be hard to explore without suitable mathematical tools. One such tool, with applications not only in linear algebra but also in differential equations, calculus, and many other areas, is the concept of eigenvalues and eigenvectors. Eigenvalues and eigenvectors are based upon a common behavior in linear systems. Let's look at an example.

Let

 

and

 

What happens with x and y if they are transformed by A? Well,

 
 

But what is remarkable is that

 
 

So when we operate on the vector x with the matrix A, instead of getting a different vector (as we would normally do), we get the same vector x multiplied by some constant. And the same goes for vector y.

We call the values 1 and -2 the eigenvalues of the matrix A, and the vectors x and y are called eigenvectors for the matrix A.
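The behavior described above can be checked with a few lines of Python. This is only a sketch: the matrix and vectors below are hypothetical values chosen for illustration (they happen to have eigenvalues 1 and -2) and are not necessarily the ones used in the example above.

```python
def mat_vec(A, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * c for a, c in zip(row, v)) for row in A]

# Hypothetical matrix and vectors, chosen for illustration only.
A = [[4, -3],
     [6, -5]]
x = [1, 1]   # eigenvector for the eigenvalue 1
y = [1, 2]   # eigenvector for the eigenvalue -2
z = [1, 0]   # an ordinary vector, not an eigenvector

print(mat_vec(A, x))  # [1, 1], which is 1 * x
print(mat_vec(A, y))  # [-2, -4], which is -2 * y
print(mat_vec(A, z))  # [4, 6], which is not a multiple of z
```

Only x and y keep their direction under A; a typical vector such as z does not.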

Definitions

We now generalize this idea of a matrix-vector product that equals a scalar multiple of the vector, as above. Given an n×n matrix A, we seek the nonzero vectors v (the eigenvectors) and the scalars λ (the eigenvalues) that satisfy the equation

Av = λv

How are we to do this? Let us rearrange the equation:

Av - λv = 0
(A-λI)v = 0 (note that we must multiply the scalar by the identity matrix, since otherwise A-λ makes no sense)

But (A-λI) is a matrix, so we are trying to solve Bv=0 where B=(A-λI), and this solution is merely the kernel of B, ker B. So the eigenvectors are in ker (A-λI), where λ is an eigenvalue. But how do we find the eigenvalues?

Bv=0 has a nonzero solution if and only if |B| = det(B) is zero. So to find the eigenvalues, we set |A-λI|=0 and then solve for λ. This gives a polynomial equation of degree n in λ, known as the characteristic equation, which we solve over the complex numbers (eigenvalues can be complex even when A is real). The roots of the characteristic equation are the eigenvalues.
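For a 2×2 matrix with rows (a, b) and (c, d), the characteristic equation can be written out in full: |A-λI| = (a-λ)(d-λ) - bc = λ² - (a+d)λ + (ad-bc) = 0. The sketch below solves this quadratic with the standard library. The function name eigenvalues_2x2 and the test matrices are illustrative assumptions, and the approach is limited to the 2×2 case.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of the characteristic equation of a 2x2 matrix.

    Solves lambda^2 - (a + d)*lambda + (a*d - b*c) = 0 with the
    quadratic formula; cmath.sqrt keeps complex roots available.
    """
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

# Hypothetical matrix with two real eigenvalues (1 and -2).
print(eigenvalues_2x2([[4, -3], [6, -5]]))

# A rotation by 90 degrees has no real eigenvalues; the roots are i and -i.
print(eigenvalues_2x2([[0, -1], [1, 0]]))
```

The second call shows why the characteristic equation is solved over the complex numbers: a real matrix can have eigenvalues that are not real.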

Note that we exclude 0 as an eigenvector, because it is trivially a solution to Av = λv and is not really interesting to consider. Additionally, if the zero vector were included, every scalar would count as an eigenvalue, since any value of λ satisfies A0 = λ0.

If we have an eigenvalue λ of a matrix A, together with a corresponding eigenvector x, then any nonzero multiple of x is also an eigenvector for the same eigenvalue. To see that kx is also an eigenvector, follow this argument: if Ax = λx, then A(kx) = kAx = kλx = λ(kx). (Here k may be any nonzero scalar.) Thus, every nonzero multiple of an eigenvector is also an eigenvector.

Note the asymmetry here: the eigenvalues of a matrix are uniquely determined, while each eigenvalue has many eigenvectors.

Finding eigenvalues and eigenvectors

Here are some examples of finding eigenvalues and eigenvectors using our definitions.

Let

 

Firstly, we expand |A-λI|=0 to find the eigenvalues:

 


 
 
 

Now, elementary algebra tells us the roots of this equation are 3 and 2, and thus these are our eigenvalues.

(Exercise: prove that in a 2×2 triangular matrix the eigenvalues are on the principal diagonal. Harder: generalize this result)

Now we can find our eigenvectors. Consider the first eigenvalue λ=3. To find our first eigenvector, we solve (A-3I)v=0:

 

At this point we can row-reduce and back-substitute, but usually it suffices to guess the kernel since our matrix is small and we have linearly dependent columns. Now, observe:

 

So, for any scalar a, the vector

  is an eigenvector. Stated another way, the set of all eigenvectors of the matrix A includes the set  . In the plane, this represents a line of slope -1 through the origin.

As noted above the eigenvalues of a matrix are uniquely determined, but for each eigenvalue there are many eigenvectors. We usually choose an eigenvector for some convenience such as "most whole number entries", "first entry is 1", or "length of the eigenvector is 1". Most Computer Algebra Systems choose unit vectors for eigenvectors.

So here we may take   to be the eigenvector, for example.

Similarly for our second eigenvalue λ=2, to find our second eigenvector:

 

And so, our second eigenvector is chosen as

 

Our eigenvalues then are λ=2,3, with eigenvectors  , as may be checked by multiplying each by the given matrix.

(We also could choose   as an eigenvector for the eigenvalue λ=3 . Check this.)
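The step of finding an eigenvector as a kernel vector of A-λI can be sketched in code for the 2×2 case. The matrix below is a hypothetical one with eigenvalues 3 and 2, used only for illustration; the helper names are assumptions, and the kernel trick shown works only for singular 2×2 matrices.

```python
import math

def kernel_vector_2x2(B):
    """Return one nonzero v with Bv = 0, for a singular, nonzero 2x2 matrix B."""
    for a, b in B:
        if a != 0 or b != 0:
            # (a, b) . (-b, a) = 0, and since det B = 0 the other row
            # is a multiple of this one, so it is annihilated as well.
            return [-b, a]
    raise ValueError("B is the zero matrix: every vector is in its kernel")

def eigenvector_2x2(A, lam):
    """Return one eigenvector of the 2x2 matrix A for the eigenvalue lam."""
    (a, b), (c, d) = A
    B = [[a - lam, b], [c, d - lam]]          # B = A - lam*I
    if B[0][0] * B[1][1] - B[0][1] * B[1][0] != 0:
        raise ValueError("lam is not an eigenvalue of A")
    return kernel_vector_2x2(B)

def unit(v):
    """Rescale v to length 1, one common normalisation convention."""
    length = math.hypot(*v)
    return [c / length for c in v]

# Hypothetical matrix with eigenvalues 3 and 2 (illustration only).
A = [[2, -1],
     [0, 3]]
v3 = eigenvector_2x2(A, 3)
v2 = eigenvector_2x2(A, 2)
print(v3)        # [1, -1]
print(v2)        # [1, 0]
print(unit(v3))  # the same direction, rescaled to length 1
```

Any nonzero multiple of the returned vector would serve equally well; the function simply picks one representative.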

Problem set

Given the above, find the eigenvalues and eigenvectors of the following matrices (answers follow):

  1.  
  2.  
  3.  
(Harder. Hint: one eigenvalue is 4.)

Answers

  1. eigenvalues: 3, 5; eigenvectors:  
  2. eigenvalues: -2, 2; eigenvectors:  
  3. eigenvalues: -3, 1, 4; eigenvectors:  

Applications

Eigenvalues and eigenvectors are not merely pretty facts about matrices; they have relevant and important applications.

Matrix powers

Let us first examine a certain class of matrices known as diagonal matrices: these are matrices in the form

 

Now, observe that

 

This is a useful property! However, the number of matrices to which we can apply this fact is clearly limited, so we ask ourselves whether we can transform a given matrix into a diagonal matrix.

The answer to this question is "sometimes", but for the moment, we will only look at matrices for which this answer is "yes".
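Before moving on, the property of diagonal matrices that motivates this question can be checked with a short sketch. It assumes the property in question is that a power of a diagonal matrix is obtained by raising each diagonal entry to that power; the entries used are hypothetical.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][m] * Y[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def diag(entries):
    """Build a diagonal matrix from a list of diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def diag_power(entries, k):
    """k-th power of diag(entries): raise each diagonal entry to the power k."""
    return diag([d ** k for d in entries])

entries = [2, -1, 3]   # hypothetical diagonal entries
k = 5

# Slow way: multiply the matrix by itself k times, starting from the identity.
slow = diag([1, 1, 1])
for _ in range(k):
    slow = mat_mul(slow, diag(entries))

# Fast way: power each diagonal entry separately.
fast = diag_power(entries, k)
print(fast)          # [[32, 0, 0], [0, -1, 0], [0, 0, 243]]
print(fast == slow)  # True
```

The fast version needs only n scalar powers, whereas the slow version needs k full matrix multiplications.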

What we seek is an invertible matrix P such that

P⁻¹AP = D

where D is diagonal.

If such a matrix P exists, we say that A is diagonalizable. (A map of the form y ↦ x⁻¹yx is often called a similarity transformation.)

Then

P⁻¹AP = D
AP = PD

by multiplying on the left by P, then

A = PDP⁻¹

by multiplying on the right by P⁻¹.

Now, we have

Aᵏ = (PDP⁻¹)ᵏ
= (PDP⁻¹)(PDP⁻¹)(PDP⁻¹)... (k times)
= PD(P⁻¹P)D(P⁻¹P)D...P⁻¹ (k times)

The P⁻¹P terms cancel to give

= PDDD...P⁻¹ (k times)
= PDᵏP⁻¹

We can calculate Dᵏ easily, so we need to find P.

It turns out (we omit the full proof) that when A has n linearly independent eigenvectors, we simply place them side by side as the columns of P. The reason is that reading AP = PD one column at a time gives Av = λv for each column v of P.

D, then, is the diagonal matrix containing the eigenvalues on the main diagonal, in the same order as the associated eigenvectors in P (the eigenvalue in the first diagonal place corresponds to the eigenvector in the first column, and so on).
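The recipe can be sketched in code for the 2×2 case, taking P to have the eigenvectors as its columns so that A = PDP⁻¹ and a power of A is P times the same power of D times P⁻¹. The matrix, its eigenvalues (1 and -2) and its eigenvectors are hypothetical values worked out by hand for illustration, and the function names are assumptions. Exact fractions are used so that the comparison with repeated multiplication is not affected by rounding.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Multiply two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][m] * Y[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(P):
    """Inverse of a 2x2 matrix: swap the diagonal, negate the rest, divide by det."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("P is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def power_by_diagonalisation(P, eigenvalues, k):
    """Compute A^k as P D^k P^-1, where the columns of P are eigenvectors of A."""
    D_k = [[Fraction(eigenvalues[0]) ** k, 0],
           [0, Fraction(eigenvalues[1]) ** k]]
    return mat_mul(mat_mul(P, D_k), inverse_2x2(P))

# Hypothetical example: eigenvalue 1 with eigenvector (1, 1),
# eigenvalue -2 with eigenvector (1, 2).
A = [[4, -3],
     [6, -5]]
P = [[1, 1],
     [1, 2]]          # eigenvectors as columns, in the same order as below
eigenvalues = [1, -2]

via_diag = power_by_diagonalisation(P, eigenvalues, 5)

direct = [[1, 0], [0, 1]]
for _ in range(5):
    direct = mat_mul(direct, A)

print(direct)              # [[34, -33], [66, -65]]
print(via_diag == direct)  # True
```

For a small power the direct product is just as quick; the benefit of the diagonal form shows when k is large, since only the diagonal entries need to be raised to the power k.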

Example

Let's work through an example to show these ideas.

 

So what do we do if we want to find A¹⁴? Let's use the method we've just described.

Find the eigenvalues:

|A-λI|=0
(3-λ)(-λ)-4=0
λ²-3λ-4=0
λ=-1, 4

Find the eigenvectors:

for λ=-1  
for λ=4  

The eigenvectors are then

 

so put the eigenvectors together to form the matrix P

 

Now -1 generated the eigenvector in the first column, and 4 generated the eigenvector in the second column, so form D in this way:

 

We can easily calculate (-1)¹⁴=1, so we get

 

and we have the fast method for creating inverses of 2×2 matrices:

 

So now we can directly multiply out

 

Simplifying we get

 

Problem set

Given the above, find the following matrix powers (Answers follow to even-numbered questions):

  1.  
  2.  
  3.  
  4.  
(More tedious: only slightly easier because this matrix is in row echelon form)
Answers
2. 
4. 

Coupled ordinary differential equations

We can use the method of diagonalisation to solve coupled ordinary differential equations. For example, let x(t) and y(t) be differentiable functions and x' and y' their derivatives. The following coupled differential equations are relatively difficult to solve as they stand:

x' = 4x - y
y' = 2x + y

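but

u' = ku for a constant k is easy to solve;

it has solution

u = Ae^(kt), where A is a constant.

Remembering this fact, we translate the ODEs into matrix form:

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} 4 & -1 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$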

Diagonalising the square matrix, we get:

 

we put

 

then it follows that

 

thus

 

as discussed above the solutions are easy. We have

 
 

for some constants C and D. Now, since

 

we get

 

and so

 
 

This method generalises well into higher dimensions.
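As a check on the system x' = 4x - y, y' = 2x + y, the sketch below builds a general solution from eigenpairs worked out by hand and confirms that it satisfies both equations. The eigenvector scalings (1, 2) and (1, 1) are one possible choice made here for illustration, so the constants C and D may play slightly different roles from those in the derivation above.

```python
import math

# Coefficient matrix of x' = 4x - y, y' = 2x + y.
# |A - lambda*I| = lambda^2 - 5*lambda + 6 = 0 gives lambda = 2 and lambda = 3.
# Eigenvectors (one choice of scaling): (1, 2) for lambda = 2, (1, 1) for lambda = 3.

def solution(t, C, D):
    """x(t), y(t) built as C*(1, 2)*e^(2t) + D*(1, 1)*e^(3t)."""
    x = C * math.exp(2 * t) + D * math.exp(3 * t)
    y = 2 * C * math.exp(2 * t) + D * math.exp(3 * t)
    return x, y

def derivative(t, C, D):
    """x'(t), y'(t), obtained by differentiating solution() term by term."""
    dx = 2 * C * math.exp(2 * t) + 3 * D * math.exp(3 * t)
    dy = 4 * C * math.exp(2 * t) + 3 * D * math.exp(3 * t)
    return dx, dy

def residual(t, C, D):
    """How far the candidate solution is from satisfying the two equations."""
    x, y = solution(t, C, D)
    dx, dy = derivative(t, C, D)
    return dx - (4 * x - y), dy - (2 * x + y)

# Arbitrary constants and sample times; both residuals should be
# zero up to floating-point rounding.
for t in (0.0, 0.5, 1.0):
    print(t, residual(t, C=1.5, D=-2.0))
```

The constants C and D would be fixed by initial conditions such as the values of x(0) and y(0).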

Coupled differential equations

Matrices are also very useful in calculus for solving coupled differential equations, in which the derivative of each unknown function depends on the other unknown functions as well. For example:

D y = 3y + x
D x = y + 3x

Without going any further, the solution to these differential equations looks very difficult! However if we formulate this in terms of matrices, it becomes a little bit easier to analyze.

Example

Let's take the above example, so

D y(t) = 3y + x
D x(t) = y + 3x

Now form a vector:

 

Then

 

Now the problem becomes

 

This is reminiscent of the differential equation we have already encountered in calculus, that of

D y = ky

in which the solution is y = ce^(kt). We can guess, then, that the matrix equation above has a solution of a similar form.

So let's try a solution v = we^(λt), where w is a constant vector. Then D v = λwe^(λt).

Let us then try and substitute this guess solution into our equation:

 

If we let

 

we see that the equation above becomes, on dividing through by e^(λt) (since it is never zero)

 

But wait: this is the same equation we used before to find the eigenvalues. So v = we^(λt) is a solution if and only if λ is an eigenvalue of A and w is a corresponding eigenvector.
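This claim can be checked numerically for the system D y = 3y + x, D x = y + 3x. The sketch assumes the coefficient matrix is the symmetric matrix with rows (3, 1) and (1, 3), which is the same whichever order the components of v are taken in, and uses eigenpairs worked out by hand. The function names are illustrative.

```python
import math

A = [[3, 1],
     [1, 3]]

def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * c for a, c in zip(row, v)) for row in A]

def is_eigenpair(A, lam, w):
    """True when A w equals lam * w exactly (integer inputs assumed)."""
    return mat_vec(A, w) == [lam * c for c in w]

def v(t, lam, w):
    """Trial solution v(t) = w * e^(lam * t)."""
    return [c * math.exp(lam * t) for c in w]

def Dv(t, lam, w):
    """Its derivative, D v = lam * w * e^(lam * t)."""
    return [lam * c * math.exp(lam * t) for c in w]

def satisfies_ode(t, lam, w):
    """Check D v = A v at time t, up to floating-point rounding."""
    lhs = Dv(t, lam, w)
    rhs = mat_vec(A, v(t, lam, w))
    return all(math.isclose(l, r) for l, r in zip(lhs, rhs))

# Eigenpairs found by hand: lambda = 4 with w = (1, 1), lambda = 2 with w = (1, -1).
for lam, w in [(4, [1, 1]), (2, [1, -1])]:
    print(lam, w, is_eigenpair(A, lam, w), satisfies_ode(0.3, lam, w))

# A pair that is not an eigenpair fails both checks.
print(is_eigenpair(A, 3, [1, 0]), satisfies_ode(0.3, 3, [1, 0]))
```

The two checks succeed or fail together, which is the content of the "if and only if" statement.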

The eigenvalues are 4, 2, with eigenvectors

 

respectively.

So we have two solutions

 

and

 

Note that if we have two solutions to the differential equation D v = Av, any linear combination of the two is also a solution. So we have the general solution:

 
 

Separating into the first and second components we get our two solutions

 .

Problem set

Given the above, solve the following problems (answers to even-numbered questions follow):

  1. Find y(t) and x(t) where D y(t)=3x(t)+6y(t) and D x(t)=x(t)+4y(t)
  2. Find y(t) and x(t) where D y(t)=2x(t)+2y(t) and D x(t)=x(t)-2y(t)
Answers

Form the matrix

 

The eigenvalues of this matrix are

 

and the eigenvectors are

 

So now

 

and y(t) and x(t) can be read off by inspection.
