Before discussing eigenvalues, eigenvectors and diagonalization,
we first provide some motivation for them.
Example. (Formula for the power of a diagonal matrix)
Let $D$ be a diagonal matrix with diagonal entries $d_1, \dots, d_n$.
Then, for each positive integer $k$, $D^k$ is the diagonal matrix with diagonal entries $d_1^k, \dots, d_n^k$,
since multiplying two diagonal matrices multiplies their corresponding diagonal entries,
and we can prove the formula for the power of a diagonal matrix by induction.
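As a quick numerical check of this formula, here is a minimal NumPy sketch (the diagonal matrix below is illustrative, not one from the text):

```python
import numpy as np

# Illustrative diagonal matrix (an assumption for demonstration; not a
# matrix from the text), with diagonal entries 2, 3, 5.
D = np.diag([2.0, 3.0, 5.0])

k = 4
# Power by repeated matrix multiplication...
Dk_by_multiplication = np.linalg.matrix_power(D, k)
# ...agrees with raising each diagonal entry to the k-th power.
Dk_by_formula = np.diag([2.0**k, 3.0**k, 5.0**k])

assert np.allclose(Dk_by_multiplication, Dk_by_formula)
```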
Example. Let $A$ be a square matrix, and suppose we can find an invertible matrix $P$ and a diagonal matrix $D$ such that $A = PDP^{-1}$.
Then, for each positive integer $n$,
$A^n = (PDP^{-1})(PDP^{-1}) \cdots (PDP^{-1}) = PD^nP^{-1}$,
since each inner product $P^{-1}P$ cancels to the identity matrix.
We can see from this example that for some special matrices, their powers can be computed conveniently, by expressing $A$ in the form $PDP^{-1}$ in which $P$ is an invertible matrix and $D$ is a diagonal matrix.
Naturally, given a square matrix, we would be interested in knowing whether it can be expressed in the form $PDP^{-1}$, and if it can, what $P$ and $D$ are, so that we can compute its powers conveniently.
This is the main objective of this chapter.
In view of the motivation section, we have the following definition.
Definition. (Diagonalizable matrix)
A square matrix $A$ is diagonalizable if there exists an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.
Remark.
An equivalent condition is $A = PDP^{-1}$ for some diagonal matrix $D$ and invertible matrix $P$, which matches the form in the motivation section.
So, if a matrix is diagonalizable, we can compute its powers conveniently.
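To illustrate this convenience numerically, the following sketch builds a diagonalizable matrix from an assumed invertible $P$ and diagonal $D$ (both chosen for demonstration only) and compares the two ways of computing its power:

```python
import numpy as np

# Illustrative choice (an assumption for demonstration): build A = P D P^{-1}
# from an invertible P and a diagonal D.
P = np.array([[1.0, 1.0],
              [1.0, -1.0]])
D = np.diag([3.0, 1.0])
A = P @ D @ np.linalg.inv(P)

n = 5
# A^n by repeated multiplication...
An_direct = np.linalg.matrix_power(A, n)
# ...versus the convenient formula A^n = P D^n P^{-1}, where D^n only
# requires raising the diagonal entries of D to the n-th power.
An_formula = P @ np.diag(np.diag(D) ** n) @ np.linalg.inv(P)

assert np.allclose(An_direct, An_formula)
```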
Example.
A matrix $A$ is shown to be diagonalizable by exhibiting an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix $D$.
Also, by the remark above, exhibiting an invertible matrix $P$ and a diagonal matrix $D$ such that $A = PDP^{-1}$ shows diagonalizability equally well.
The following are important and general concepts, which are related to diagonalizability.
Definition. (Eigenvector and eigenvalue)
Let $A$ be a square matrix. A nonzero vector $\mathbf{v}$ is an eigenvector of $A$
if there exists a scalar $\lambda$ such that $A\mathbf{v} = \lambda\mathbf{v}$.
In that case, $\lambda$ is an eigenvalue of $A$ corresponding to the eigenvector $\mathbf{v}$.
Remark.
$A\mathbf{v} = \lambda\mathbf{v}$ means that multiplying the vector $\mathbf{v}$ by the matrix $A$ is equivalent to multiplying it by the scalar $\lambda$ (a scaling of the vector)
the prefix eigen- has the meanings of 'own', 'proper', and 'characteristic'
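The defining condition $A\mathbf{v} = \lambda\mathbf{v}$ can be checked directly; here is a small sketch with an illustrative matrix (an assumption, not one from the text):

```python
import numpy as np

# Illustrative matrix (an assumption for demonstration).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# (1, 1) is an eigenvector of A: multiplying it by A scales it by 3,
# so 3 is the corresponding eigenvalue.
v1 = np.array([1.0, 1.0])
assert np.allclose(A @ v1, 3.0 * v1)

# (1, -1) is another eigenvector, with corresponding eigenvalue 1.
v2 = np.array([1.0, -1.0])
assert np.allclose(A @ v2, 1.0 * v2)
```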
Example.
(Eigenvectors of identity matrix)
Each nonzero vector $\mathbf{v}$ is an eigenvector of the identity matrix $I$, since
$I\mathbf{v} = \mathbf{v} = 1\mathbf{v}$ for each vector $\mathbf{v}$,
and the corresponding eigenvalues are all $1$.
The following theorem relates diagonalizable matrices with eigenvectors and eigenvalues.
Theorem. (Diagonalization)
Let $A$ be an $n \times n$ matrix. Then,
$A$ is diagonalizable if and only if $A$ has $n$ linearly independent eigenvectors.
If $\mathbf{v}_1, \dots, \mathbf{v}_n$ are $n$ linearly independent eigenvectors of $A$
corresponding to the eigenvalues $\lambda_1, \dots, \lambda_n$ (some of them are possibly the same),
we can define an invertible matrix $P$ with
columns $\mathbf{v}_1, \dots, \mathbf{v}_n$ and a diagonal matrix $D$
with diagonal entries $\lambda_1, \dots, \lambda_n$ such that $P^{-1}AP = D$.
Proof. In the following, we use $P$ to denote the matrix with columns $\mathbf{v}_1, \dots, \mathbf{v}_n$, in this order, and $D$ to denote the diagonal matrix with diagonal entries $\lambda_1, \dots, \lambda_n$.
Since $AP = \begin{pmatrix} A\mathbf{v}_1 & \cdots & A\mathbf{v}_n \end{pmatrix}$ and $PD = \begin{pmatrix} \lambda_1\mathbf{v}_1 & \cdots & \lambda_n\mathbf{v}_n \end{pmatrix}$,
the condition $P^{-1}AP = D$, i.e. $AP = PD$, holds if and only if $A\mathbf{v}_i = \lambda_i\mathbf{v}_i$ for each $i$.
We have now proved that $\mathbf{v}_1, \dots, \mathbf{v}_n$ are eigenvectors. It remains to prove that they are linearly independent,
which is true since they are linearly independent if and only if $P$ is invertible, by the proposition about the relationship between invertibility and linear independence.
Remark.
we can put the eigenvectors into $P$ as columns in an arbitrary order, as long as we put the eigenvalues into the corresponding columns of $D$; e.g. if we put the eigenvector $\mathbf{v}_1$ into the 3rd column of $P$, we need to put the eigenvalue $\lambda_1$ into the 3rd column of $D$
it follows that the expression for diagonalization is not unique, and there are actually infinitely many such expressions
we have $AP = \begin{pmatrix} A\mathbf{v}_1 & \cdots & A\mathbf{v}_n \end{pmatrix}$ by the definition of matrix multiplication
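The column-by-column identity $AP = PD$ in the theorem can be verified numerically; this sketch uses an illustrative matrix (an assumption, not one from the text):

```python
import numpy as np

# Illustrative matrix (an assumption for demonstration).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns are
# corresponding eigenvectors, i.e. exactly the P of the theorem.
eigenvalues, P = np.linalg.eig(A)
D = np.diag(eigenvalues)

# Column by column, A v_i = lambda_i v_i, i.e. AP = PD ...
assert np.allclose(A @ P, P @ D)
# ... and since the columns of P are linearly independent, P^{-1}AP = D.
assert np.allclose(np.linalg.inv(P) @ A @ P, D)
```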
Then, we will introduce a convenient way to find eigenvalues.
Before this, we introduce some terminology related to this way of finding eigenvalues.
Definition.
(Characteristic polynomial)
Let $A$ be an $n \times n$ matrix. The characteristic polynomial of $A$ in the variable $t$ is the polynomial $\det(A - tI)$.
Remark.
we can use an arbitrary letter for the variable
equivalently, the characteristic polynomial of $A$ is the determinant of the matrix obtained from $A$ by subtracting $t$ from each of its diagonal entries
Example.
The characteristic polynomial of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is $\det\begin{pmatrix} a - t & b \\ c & d - t \end{pmatrix} = (a - t)(d - t) - bc$.
Proposition.
(Equivalent condition for eigenvalue)
Let $A$ be an $n \times n$ matrix. Then, a scalar $\lambda$ is an eigenvalue of $A$
if and only if $\det(A - \lambda I) = 0$, i.e. $\lambda$ is a root of the characteristic polynomial of $A$.
Proof. $\lambda$ is an eigenvalue of $A$ if and only if $A\mathbf{v} = \lambda\mathbf{v}$ for some nonzero vector $\mathbf{v}$, i.e. the homogeneous system $(A - \lambda I)\mathbf{v} = \mathbf{0}$ has a nonzero solution, which holds if and only if $A - \lambda I$ is not invertible, i.e. $\det(A - \lambda I) = 0$.
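This proposition can be checked symbolically; the matrix below is illustrative (an assumption, not one from the text):

```python
import sympy as sp

t = sp.symbols('t')
# Illustrative matrix (an assumption for demonstration).
A = sp.Matrix([[2, 1],
               [1, 2]])

# Characteristic polynomial det(A - t I).
p = (A - t * sp.eye(2)).det()
assert sp.expand(p) == t**2 - 4*t + 3

# Its roots coincide with the eigenvalues of A.
roots = sp.solve(sp.Eq(p, 0), t)
assert set(roots) == {1, 3}
assert set(A.eigenvals().keys()) == {1, 3}
```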
Then, we will introduce a concept which is related to eigenvectors.
Definition.
(Eigenspace)
Let $A$ be an $n \times n$ matrix. Suppose $\lambda$ is an eigenvalue of $A$.
Then, the null space of $A - \lambda I$
is the eigenspace of $A$ corresponding to $\lambda$.
Remark.
since the null space is a subspace of $\mathbb{R}^n$, the eigenspace is also a subspace of $\mathbb{R}^n$
the eigenspace consists of the zero vector (since it is a subspace) and all eigenvectors corresponding to $\lambda$, since $A\mathbf{v} = \lambda\mathbf{v} \iff (A - \lambda I)\mathbf{v} = \mathbf{0}$
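Computing an eigenspace as a null space can be sketched as follows, again with an illustrative matrix (an assumption, not one from the text):

```python
import sympy as sp

# Illustrative matrix (an assumption for demonstration); 3 is one of its
# eigenvalues.
A = sp.Matrix([[2, 1],
               [1, 2]])
lam = 3

# The eigenspace corresponding to lam is the null space of A - lam*I.
basis = (A - lam * sp.eye(2)).nullspace()

assert len(basis) == 1          # this eigenspace is 1-dimensional
v = basis[0]
assert A * v == lam * v         # every basis vector is an eigenvector
```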
After introducing these terminologies and concepts, we have the following algorithmic procedure for a diagonalization of an $n \times n$ matrix $A$:
compute all eigenvalues of $A$ by solving $\det(A - tI) = 0$
for each eigenvalue $\lambda$ of $A$, find a basis for the eigenspace of $\lambda$ respectively
if the bases together contain $n$ vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ (if not, $A$ is not diagonalizable), define $P$ to be the matrix with columns $\mathbf{v}_1, \dots, \mathbf{v}_n$
we have $P^{-1}AP = D$, in which $D$ is a diagonal matrix whose diagonal entries are the eigenvalues corresponding to the columns $\mathbf{v}_1, \dots, \mathbf{v}_n$ of $P$
Remark.
it can be proved that eigenvectors of $A$ corresponding to distinct eigenvalues are linearly independent (the proof is skipped here)
so the columns of $P$, taken from bases of the eigenspaces, are linearly independent, and thus $P$ is invertible
if $A$ has $n$ different eigenvalues, then $A$ is diagonalizable [1], since there are $n$ bases corresponding to the $n$ eigenvalues, which together contain $n$ vectors
there are infinitely many possible bases for each eigenspace, but we only need one of them
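The whole procedure above can be sketched step by step in SymPy; the matrix is illustrative (an assumption, not one from the text):

```python
import sympy as sp

# Illustrative matrix (an assumption for demonstration).
A = sp.Matrix([[4, 1],
               [2, 3]])
n = A.rows

# Step 1: compute all eigenvalues by solving det(A - t I) = 0.
eigenvalues = sorted(A.eigenvals().keys())

# Step 2: find a basis for each eigenspace.
vectors, diagonal_entries = [], []
for lam in eigenvalues:
    for v in (A - lam * sp.eye(n)).nullspace():
        vectors.append(v)
        diagonal_entries.append(lam)

# Step 3: the bases together contain n vectors, so A is diagonalizable;
# put the basis vectors into P as columns.
assert len(vectors) == n
P = sp.Matrix.hstack(*vectors)
D = sp.diag(*diagonal_entries)

# Step 4: P^{-1} A P is the diagonal matrix of corresponding eigenvalues.
assert P.inv() * A * P == D
```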
Example.
(Diagonalization of a $2 \times 2$ matrix)
Recall the example in the motivation section: it is given there that the matrix is diagonalizable, and its expression in the form $PDP^{-1}$ is also given.
We will use the above procedure to derive such an expression.
First, we solve the characteristic equation $\det(A - tI) = 0$, whose roots give the two eigenvalues of the matrix.
For the first eigenvalue $\lambda_1$,
we solve the homogeneous system $(A - \lambda_1 I)\mathbf{v} = \mathbf{0}$;
its general solution is spanned by a single vector,
so a basis for the eigenspace of $\lambda_1$ contains one vector.
For the second eigenvalue $\lambda_2$, we similarly obtain a basis for its eigenspace containing one vector.
Then, we let $P$ be the matrix whose columns are these two basis vectors
(since the two bases together contain two vectors),
and $D$ the diagonal matrix with diagonal entries $\lambda_1$ and $\lambda_2$. Then, we can compute $P^{-1}$.
Therefore, we have $A = PDP^{-1}$,
which is the same as the given form in the example in the motivation section.
In general, if we have $A = PDP^{-1}$, then $A^n = PD^nP^{-1}$ for each positive integer $n$,
which is illustrated in the example in the motivation section.
Example.
(Diagonalization of a $3 \times 3$ matrix)
Consider a $3 \times 3$ matrix $A$.
We would like to find a formula for $A^n$.
First, we solve the characteristic equation $\det(A - tI) = 0$;
suppose it yields two distinct eigenvalues, one of which is a repeated root.
For the repeated eigenvalue $\lambda_1$, suppose the general solution of the homogeneous system $(A - \lambda_1 I)\mathbf{v} = \mathbf{0}$ has two independent unknowns,
so the dimension of the eigenspace is $2$, i.e. there should be two vectors in each of its bases;
a basis for the eigenspace of $\lambda_1$ thus contains two vectors.
For the other eigenvalue $\lambda_2$, a basis for its eigenspace contains one vector.
Then, we let $P$ be the matrix whose columns are the three basis vectors
(since the two bases together contain three vectors),
and $D$ the diagonal matrix with diagonal entries $\lambda_1, \lambda_1, \lambda_2$
(we have two eigenvectors corresponding to the eigenvalue $\lambda_1$, so this eigenvalue is repeated two times in $D$).
Then, we can compute $P^{-1}$.
It follows that $A = PDP^{-1}$,
and $A^n = PD^nP^{-1}$ gives a formula for $A^n$.
This is an interesting result.
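A closed-form power of a $3 \times 3$ matrix with a repeated eigenvalue can be derived symbolically; the matrix below is illustrative (an assumption, not the one from the text):

```python
import sympy as sp

n = sp.symbols('n', integer=True, positive=True)

# Illustrative 3x3 matrix with a repeated eigenvalue (an assumption for
# demonstration): eigenvalue 2 has a 2-dimensional eigenspace.
A = sp.Matrix([[2, 0, 0],
               [0, 2, 0],
               [0, 1, 3]])

P, D = A.diagonalize()          # P^{-1} A P = D
assert P.inv() * A * P == D

# A closed form A^n = P D^n P^{-1}; D^n raises the diagonal entries to n.
Dn = sp.diag(*[d**n for d in D.diagonal()])
An = (P * Dn * P.inv()).applyfunc(sp.simplify)

# Sanity check against direct multiplication for n = 3.
assert An.subs(n, 3) == A**3
```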
Example.
(Complex eigenvalues)
Let $A$ be a real square matrix whose characteristic polynomial has no real roots. [2]
Since the eigenvalues are all complex and non-real (and thus there does not exist any corresponding real eigenvector),
$A$ is not diagonalizable over the real matrices.
On the other hand, $A$ is diagonalizable over the complex matrices,
but diagonalization over complex matrices is not our main focus in this book, and we have not defined operations on complex matrices,
so we do not derive an expression of $A$ in the form $PDP^{-1}$ here.
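As a numerical illustration, the following sketch uses the $2 \times 2$ matrix representing the complex number $i$ (an assumption for demonstration, consistent with the footnote but not necessarily the matrix from the text):

```python
import numpy as np

# Illustrative matrix (an assumption for demonstration): the 2x2 matrix
# representing the complex number i, i.e. a rotation by 90 degrees.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

eigenvalues, P = np.linalg.eig(A)

# The eigenvalues are the non-real numbers i and -i, so A has no real
# eigenvectors and no real diagonalization.
assert np.allclose(sorted(eigenvalues, key=lambda z: z.imag), [-1j, 1j])

# Over the complex numbers, however, P^{-1} A P is diagonal.
assert np.allclose(np.linalg.inv(P) @ A @ P, np.diag(eigenvalues))
```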
Example.
(Non-diagonalizable matrix)
Consider a nonzero $2 \times 2$ matrix $N$ that is nilpotent, satisfying $N^2 = O$.
First, since the characteristic polynomial of $N$ is $\det(N - tI) = t^2$,
the only eigenvalue is $0$.
For the eigenvalue $0$, the eigenspace is the null space of $N - 0I = N$.
So, a basis for the eigenspace contains one vector only.
Since it only contains one vector, while the size of the matrix is $2 \times 2$ (so two linearly independent eigenvectors are needed),
$N$ is not diagonalizable.
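This can be confirmed symbolically with the standard nonzero nilpotent $2 \times 2$ matrix (an illustrative assumption, not necessarily the matrix from the text):

```python
import sympy as sp

# Illustrative nonzero nilpotent matrix (an assumption for demonstration),
# satisfying N^2 = O.
N = sp.Matrix([[0, 1],
               [0, 0]])
assert N**2 == sp.zeros(2, 2)

# The only eigenvalue is 0 (with algebraic multiplicity 2) ...
assert N.eigenvals() == {0: 2}
# ... but its eigenspace, the null space of N, has only one basis vector,
assert len(N.nullspace()) == 1
# so there are not enough linearly independent eigenvectors.
assert not N.is_diagonalizable()
```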
In the following, we will discuss some mathematical applications of diagonalization, namely deducing formulas for sequences and solving systems of ordinary differential equations (ODEs).
Example.
(Fibonacci sequence)
Consider the Fibonacci sequence, in which $F_0 = 0$, $F_1 = 1$, and $F_{n+2} = F_{n+1} + F_n$ for each nonnegative integer $n$.
For each nonnegative integer $n$, this recurrence relation can be described as
$\begin{pmatrix} F_{n+2} \\ F_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix}$.
Let $A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$.
Then, $\begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix} = A^n \begin{pmatrix} F_1 \\ F_0 \end{pmatrix} = A^n \begin{pmatrix} 1 \\ 0 \end{pmatrix}$.
To obtain an expression for $F_n$, it suffices to find a formula for $A^n$, and we may find it by diagonalization.
Since $\det(A - tI) = (1 - t)(-t) - 1 = t^2 - t - 1$, the eigenvalues of $A$ are $\frac{1 \pm \sqrt{5}}{2}$.
Let $\varphi = \frac{1 + \sqrt{5}}{2}$ be the golden ratio,
and $\psi = \frac{1 - \sqrt{5}}{2}$ be the conjugate of the golden ratio.
For the eigenvalue $\varphi$, since $(A - \varphi I)\mathbf{v} = \mathbf{0}$ for each corresponding eigenvector $\mathbf{v}$,
we can transform the augmented matrix representing this SLE to RREF as follows:
$\left(\begin{array}{cc|c} 1 - \varphi & 1 & 0 \\ 1 & -\varphi & 0 \end{array}\right) \to \left(\begin{array}{cc|c} 1 & -\varphi & 0 \\ 0 & 0 & 0 \end{array}\right)$ [3]
So, the general solution is $\mathbf{v} = t \begin{pmatrix} \varphi \\ 1 \end{pmatrix}$, and thus a basis for the eigenspace of $\varphi$ is $\left\{ \begin{pmatrix} \varphi \\ 1 \end{pmatrix} \right\}$.
For the eigenvalue $\psi$, since $(A - \psi I)\mathbf{v} = \mathbf{0}$ for each corresponding eigenvector $\mathbf{v}$,
and the RREF of the augmented matrix representing this SLE is
$\left(\begin{array}{cc|c} 1 & -\psi & 0 \\ 0 & 0 & 0 \end{array}\right)$
by symmetry[4].
So, the general solution is $\mathbf{v} = t \begin{pmatrix} \psi \\ 1 \end{pmatrix}$, and thus a basis for the eigenspace of $\psi$ is $\left\{ \begin{pmatrix} \psi \\ 1 \end{pmatrix} \right\}$.
Then, we let $P = \begin{pmatrix} \varphi & \psi \\ 1 & 1 \end{pmatrix}$,
$D = \begin{pmatrix} \varphi & 0 \\ 0 & \psi \end{pmatrix}$.
We can compute that $P^{-1} = \frac{1}{\varphi - \psi} \begin{pmatrix} 1 & -\psi \\ -1 & \varphi \end{pmatrix} = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 & -\psi \\ -1 & \varphi \end{pmatrix}$.
Then, $A = PDP^{-1}$, and thus
$A^n = PD^nP^{-1} = \frac{1}{\sqrt{5}} \begin{pmatrix} \varphi & \psi \\ 1 & 1 \end{pmatrix} \begin{pmatrix} \varphi^n & 0 \\ 0 & \psi^n \end{pmatrix} \begin{pmatrix} 1 & -\psi \\ -1 & \varphi \end{pmatrix}$.
Finally, we have
$\begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix} = A^n \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{5}} \begin{pmatrix} \varphi^{n+1} - \psi^{n+1} \\ \varphi^n - \psi^n \end{pmatrix}$.
Thus, $F_n = \frac{\varphi^n - \psi^n}{\sqrt{5}}$,
in which $\varphi = \frac{1 + \sqrt{5}}{2}$ and $\psi = \frac{1 - \sqrt{5}}{2}$.
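The closed-form formula derived above can be checked against the recurrence directly:

```python
import math

# Binet's formula F_n = (phi^n - psi^n) / sqrt(5), checked against the
# recurrence F_0 = 0, F_1 = 1, F_{n+2} = F_{n+1} + F_n.
phi = (1 + math.sqrt(5)) / 2    # golden ratio
psi = (1 - math.sqrt(5)) / 2    # conjugate of the golden ratio

def fib_binet(n):
    # Rounding absorbs floating-point error; the exact value is an integer.
    return round((phi**n - psi**n) / math.sqrt(5))

fibs = [0, 1]
for _ in range(20):
    fibs.append(fibs[-1] + fibs[-2])

assert [fib_binet(n) for n in range(len(fibs))] == fibs
```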
Exercise.
Define a sequence in which the first two terms are given and each later term satisfies a linear recurrence of the form $a_{n+2} = p a_{n+1} + q a_n$ for each nonnegative integer $n$, and find a formula for $a_n$ by diagonalization, as in the Fibonacci example.
Example.
(System of ODEs)
Consider a system of two linear ODEs with constant coefficients,
with initial conditions given when $t = 0$.
Using the dot notation for differentiation, the system can be rewritten as $\dot{\mathbf{x}} = A\mathbf{x}$,
in which $\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and $A$ is the coefficient matrix.
Suppose we can write $A = PDP^{-1}$, in which $P$ is an invertible matrix and $D$ is a diagonal matrix with diagonal entries $\lambda_1$ and $\lambda_2$.
Let $\mathbf{y} = P^{-1}\mathbf{x}$, which implies $\mathbf{x} = P\mathbf{y}$ and, since $P$ is a constant matrix, $\dot{\mathbf{x}} = P\dot{\mathbf{y}}$.
It follows that $P\dot{\mathbf{y}} = A\mathbf{x} = PDP^{-1}(P\mathbf{y}) = PD\mathbf{y}$, and thus $\dot{\mathbf{y}} = D\mathbf{y}$.
Since $D$ is diagonal, the system simplifies to the decoupled equations $\dot{y}_1 = \lambda_1 y_1$ and $\dot{y}_2 = \lambda_2 y_2$,
whose general solution is $y_1 = c_1 e^{\lambda_1 t}$ and $y_2 = c_2 e^{\lambda_2 t}$, in which $c_1$ and $c_2$ are arbitrary constants.
Then, we find $P$ and $D$ by diagonalizing $A$:
for each eigenvalue $\lambda_i$, we solve $(A - \lambda_i I)\mathbf{v} = \mathbf{0}$ for its general solution, which yields a basis for the eigenspace of $\lambda_i$.
Then, we let $P$ be the matrix whose columns are the two basis vectors, and $D$ the diagonal matrix with the corresponding eigenvalues $\lambda_1$ and $\lambda_2$ as its diagonal entries.
It follows that $\mathbf{x} = P\mathbf{y}$ expresses $x_1$ and $x_2$ as linear combinations of $c_1 e^{\lambda_1 t}$ and $c_2 e^{\lambda_2 t}$.
Imposing the initial conditions when $t = 0$ determines the constants $c_1$ and $c_2$.
Thus, substituting these constants back into $\mathbf{x} = P\mathbf{y}$ gives the solution to this system of ODEs.
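The decoupling method can be sketched numerically; the coefficient matrix and initial condition below are illustrative assumptions, not the system from the text:

```python
import numpy as np

# Illustrative system (an assumption for demonstration): x' = A x with a
# diagonalizable coefficient matrix A and initial condition x(0).
A = np.array([[1.0, 2.0],
              [2.0, 1.0]])
x0 = np.array([1.0, 0.0])

eigenvalues, P = np.linalg.eig(A)           # A = P D P^{-1}

def x(t):
    # From y' = D y with y = P^{-1} x: y_i(t) = e^{lambda_i t} y_i(0),
    # so x(t) = P diag(e^{lambda_i t}) P^{-1} x(0).
    return P @ np.diag(np.exp(eigenvalues * t)) @ np.linalg.inv(P) @ x0

# The formula satisfies the initial condition ...
assert np.allclose(x(0.0), x0)
# ... and the ODE itself: a central-difference derivative matches A x(t).
t, h = 0.7, 1e-6
numeric_derivative = (x(t + h) - x(t - h)) / (2 * h)
assert np.allclose(numeric_derivative, A @ x(t), atol=1e-4)
```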
↑but even if $A$ has strictly fewer than $n$ different eigenvalues, $A$ can still be diagonalizable. Actually, $A$ can have at most $n$ different eigenvalues, since the characteristic polynomial of $A$ is a degree $n$ polynomial in $t$, which has $n$ roots (some of them may be repeated), by the fundamental theorem of algebra
↑It is the matrix representation of a complex number.