Before discussing eigenvalues, eigenvectors and diagonalization,
we first provide some motivation for them.
Example. (Formula for the power of a diagonal matrix)
Let $D$ be a diagonal matrix with diagonal entries $d_1, \dots, d_n$.
Then, for each positive integer $k$, $D^k$ is the diagonal matrix with diagonal entries $d_1^k, \dots, d_n^k$,
since multiplying two diagonal matrices multiplies their corresponding diagonal entries,
and we can prove the formula for the power of a diagonal matrix by induction.
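As a quick numerical check of this formula, here is a minimal NumPy sketch (the diagonal matrix below is illustrative, not one from the text):

```python
import numpy as np

# Illustrative diagonal matrix (an assumption for demonstration; not a
# matrix from the text), with diagonal entries 2, 3, 5.
D = np.diag([2.0, 3.0, 5.0])

k = 4
# Power by repeated matrix multiplication...
Dk_by_multiplication = np.linalg.matrix_power(D, k)
# ...agrees with raising each diagonal entry to the k-th power.
Dk_by_formula = np.diag([2.0**k, 3.0**k, 5.0**k])

assert np.allclose(Dk_by_multiplication, Dk_by_formula)
```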
Example. Let $A$ be a square matrix, and suppose we can find an invertible matrix $P$ and a diagonal matrix $D$ such that $A = PDP^{-1}$.
Then, for each positive integer $n$,
$A^n = (PDP^{-1})(PDP^{-1}) \cdots (PDP^{-1}) = PD^nP^{-1}$,
since each inner product $P^{-1}P$ cancels to the identity matrix.
We can see from this example that for some special matrices, their powers can be computed conveniently, by expressing $A$ in the form $PDP^{-1}$ in which $P$ is an invertible matrix and $D$ is a diagonal matrix.
Naturally, given a square matrix, we would be interested in knowing whether it can be expressed in the form $PDP^{-1}$, and if it can, what $P$ and $D$ are, so that we can compute its powers conveniently.
This is the main objective of this chapter.
In view of the motivation section, we have the following definition.
Definition. (Diagonalizable matrix)
A square matrix $A$ is diagonalizable if there exists an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.
Remark.
An equivalent condition is $A = PDP^{-1}$ for some diagonal matrix $D$ and invertible matrix $P$, which matches the form in the motivation section.
So, if a matrix is diagonalizable, we can compute its powers conveniently.
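To illustrate this convenience numerically, the following sketch builds a diagonalizable matrix from an assumed invertible $P$ and diagonal $D$ (both chosen for demonstration only) and compares the two ways of computing its power:

```python
import numpy as np

# Illustrative choice (an assumption for demonstration): build A = P D P^{-1}
# from an invertible P and a diagonal D.
P = np.array([[1.0, 1.0],
              [1.0, -1.0]])
D = np.diag([3.0, 1.0])
A = P @ D @ np.linalg.inv(P)

n = 5
# A^n by repeated multiplication...
An_direct = np.linalg.matrix_power(A, n)
# ...versus the convenient formula A^n = P D^n P^{-1}, where D^n only
# requires raising the diagonal entries of D to the n-th power.
An_formula = P @ np.diag(np.diag(D) ** n) @ np.linalg.inv(P)

assert np.allclose(An_direct, An_formula)
```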
Example.
A matrix $A$ is shown to be diagonalizable by exhibiting an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix $D$.
Also, by the remark above, exhibiting an invertible matrix $P$ and a diagonal matrix $D$ such that $A = PDP^{-1}$ shows diagonalizability equally well.
The following are important and general concepts, which are related to diagonalizability.
Definition. (Eigenvector and eigenvalue)
Let $A$ be a square matrix. A nonzero vector $\mathbf{v}$ is an eigenvector of $A$
if there exists a scalar $\lambda$ such that $A\mathbf{v} = \lambda\mathbf{v}$.
In that case, $\lambda$ is an eigenvalue of $A$ corresponding to the eigenvector $\mathbf{v}$.
Remark.
$A\mathbf{v} = \lambda\mathbf{v}$ means that multiplying the vector $\mathbf{v}$ by the matrix $A$ is equivalent to multiplying it by the scalar $\lambda$ (a scaling of the vector)
the prefix eigen- has the meanings of 'own', 'proper', and 'characteristic'
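The defining condition $A\mathbf{v} = \lambda\mathbf{v}$ can be checked directly; here is a small sketch with an illustrative matrix (an assumption, not one from the text):

```python
import numpy as np

# Illustrative matrix (an assumption for demonstration).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# (1, 1) is an eigenvector of A: multiplying it by A scales it by 3,
# so 3 is the corresponding eigenvalue.
v1 = np.array([1.0, 1.0])
assert np.allclose(A @ v1, 3.0 * v1)

# (1, -1) is another eigenvector, with corresponding eigenvalue 1.
v2 = np.array([1.0, -1.0])
assert np.allclose(A @ v2, 1.0 * v2)
```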
Example.
(Eigenvectors of identity matrix)
Each nonzero vector $\mathbf{v}$ is an eigenvector of the identity matrix $I$, since
$I\mathbf{v} = \mathbf{v} = 1\mathbf{v}$ for each vector $\mathbf{v}$,
and the corresponding eigenvalues are all $1$.
The following theorem relates diagonalizable matrices with eigenvectors and eigenvalues.
Theorem. (Diagonalization)
Let $A$ be an $n \times n$ matrix. Then,
$A$ is diagonalizable if and only if $A$ has $n$ linearly independent eigenvectors.
If $\mathbf{v}_1, \dots, \mathbf{v}_n$ are $n$ linearly independent eigenvectors of $A$
corresponding to the eigenvalues $\lambda_1, \dots, \lambda_n$ (some of them are possibly the same),
we can define an invertible matrix $P$ with
columns $\mathbf{v}_1, \dots, \mathbf{v}_n$ and a diagonal matrix $D$
with diagonal entries $\lambda_1, \dots, \lambda_n$ such that $P^{-1}AP = D$.
Proof. In the following, we use $P$ to denote the matrix with columns $\mathbf{v}_1, \dots, \mathbf{v}_n$, in this order, and $D$ to denote the diagonal matrix with diagonal entries $\lambda_1, \dots, \lambda_n$.
Since $AP = \begin{pmatrix} A\mathbf{v}_1 & \cdots & A\mathbf{v}_n \end{pmatrix}$ and $PD = \begin{pmatrix} \lambda_1\mathbf{v}_1 & \cdots & \lambda_n\mathbf{v}_n \end{pmatrix}$,
the condition $P^{-1}AP = D$, i.e. $AP = PD$, holds if and only if $A\mathbf{v}_i = \lambda_i\mathbf{v}_i$ for each $i$.
We have now proved that $\mathbf{v}_1, \dots, \mathbf{v}_n$ are eigenvectors. It remains to prove that they are linearly independent,
which is true since they are linearly independent if and only if $P$ is invertible, by the proposition about the relationship between invertibility and linear independence.
Remark.
we can put the eigenvectors into $P$ as columns in an arbitrary order, as long as we put the eigenvalues into the corresponding columns of $D$; e.g. if we put the eigenvector $\mathbf{v}_1$ into the 3rd column of $P$, we need to put the eigenvalue $\lambda_1$ into the 3rd column of $D$
it follows that the expression for diagonalization is not unique, and there are actually infinitely many such expressions
we have $AP = \begin{pmatrix} A\mathbf{v}_1 & \cdots & A\mathbf{v}_n \end{pmatrix}$ by the definition of matrix multiplication
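The column-by-column identity $AP = PD$ in the theorem can be verified numerically; this sketch uses an illustrative matrix (an assumption, not one from the text):

```python
import numpy as np

# Illustrative matrix (an assumption for demonstration).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns are
# corresponding eigenvectors, i.e. exactly the P of the theorem.
eigenvalues, P = np.linalg.eig(A)
D = np.diag(eigenvalues)

# Column by column, A v_i = lambda_i v_i, i.e. AP = PD ...
assert np.allclose(A @ P, P @ D)
# ... and since the columns of P are linearly independent, P^{-1}AP = D.
assert np.allclose(np.linalg.inv(P) @ A @ P, D)
```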
Then, we will introduce a convenient way to find eigenvalues.
Before this, we introduce some terminology related to this way of finding eigenvalues.
Definition.
(Characteristic polynomial)
Let $A$ be an $n \times n$ matrix. The characteristic polynomial of $A$ in the variable $t$ is the polynomial $\det(A - tI)$.
Remark.
we can use an arbitrary letter for the variable
equivalently, the characteristic polynomial of $A$ is the determinant of the matrix obtained from $A$ by subtracting $t$ from each of its diagonal entries
Example.
The characteristic polynomial of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is $\det\begin{pmatrix} a - t & b \\ c & d - t \end{pmatrix} = (a - t)(d - t) - bc$.
Proposition.
(Equivalent condition for eigenvalue)
Let $A$ be an $n \times n$ matrix. Then, a scalar $\lambda$ is an eigenvalue of $A$
if and only if $\det(A - \lambda I) = 0$, i.e. $\lambda$ is a root of the characteristic polynomial of $A$.
Proof. $\lambda$ is an eigenvalue of $A$ if and only if $A\mathbf{v} = \lambda\mathbf{v}$ for some nonzero vector $\mathbf{v}$, i.e. the homogeneous system $(A - \lambda I)\mathbf{v} = \mathbf{0}$ has a nonzero solution, which holds if and only if $A - \lambda I$ is not invertible, i.e. $\det(A - \lambda I) = 0$.
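This proposition can be checked symbolically; the matrix below is illustrative (an assumption, not one from the text):

```python
import sympy as sp

t = sp.symbols('t')
# Illustrative matrix (an assumption for demonstration).
A = sp.Matrix([[2, 1],
               [1, 2]])

# Characteristic polynomial det(A - t I).
p = (A - t * sp.eye(2)).det()
assert sp.expand(p) == t**2 - 4*t + 3

# Its roots coincide with the eigenvalues of A.
roots = sp.solve(sp.Eq(p, 0), t)
assert set(roots) == {1, 3}
assert set(A.eigenvals().keys()) == {1, 3}
```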
Then, we will introduce a concept which is related to eigenvectors.
Definition.
(Eigenspace)
Let $A$ be an $n \times n$ matrix. Suppose $\lambda$ is an eigenvalue of $A$.
Then, the null space of $A - \lambda I$
is the eigenspace of $A$ corresponding to $\lambda$.
Remark.
since the null space is a subspace of $\mathbb{R}^n$, the eigenspace is also a subspace of $\mathbb{R}^n$
the eigenspace consists of the zero vector (since it is a subspace) and all eigenvectors corresponding to $\lambda$, since $A\mathbf{v} = \lambda\mathbf{v} \iff (A - \lambda I)\mathbf{v} = \mathbf{0}$
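Computing an eigenspace as a null space can be sketched as follows, again with an illustrative matrix (an assumption, not one from the text):

```python
import sympy as sp

# Illustrative matrix (an assumption for demonstration); 3 is one of its
# eigenvalues.
A = sp.Matrix([[2, 1],
               [1, 2]])
lam = 3

# The eigenspace corresponding to lam is the null space of A - lam*I.
basis = (A - lam * sp.eye(2)).nullspace()

assert len(basis) == 1          # this eigenspace is 1-dimensional
v = basis[0]
assert A * v == lam * v         # every basis vector is an eigenvector
```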
After introducing these terminologies and concepts, we have the following algorithmic procedure for a diagonalization of an $n \times n$ matrix $A$:
compute all eigenvalues of $A$ by solving $\det(A - tI) = 0$
for each eigenvalue $\lambda$ of $A$, find a basis for the eigenspace of $\lambda$ respectively
if the bases together contain $n$ vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ (if not, $A$ is not diagonalizable), define $P$ to be the matrix with columns $\mathbf{v}_1, \dots, \mathbf{v}_n$
we have $P^{-1}AP = D$, in which $D$ is a diagonal matrix whose diagonal entries are the eigenvalues corresponding to the columns $\mathbf{v}_1, \dots, \mathbf{v}_n$ of $P$
Remark.
it can be proved that eigenvectors of $A$ corresponding to distinct eigenvalues are linearly independent (the proof is skipped here)
so the columns of $P$, taken from bases of the eigenspaces, are linearly independent, and thus $P$ is invertible
if $A$ has $n$ different eigenvalues, then $A$ is diagonalizable [1], since there are $n$ bases corresponding to the $n$ eigenvalues, which together contain $n$ vectors
there are infinitely many possible bases for each eigenspace, but we only need one of them
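The whole procedure above can be sketched step by step in SymPy; the matrix is illustrative (an assumption, not one from the text):

```python
import sympy as sp

# Illustrative matrix (an assumption for demonstration).
A = sp.Matrix([[4, 1],
               [2, 3]])
n = A.rows

# Step 1: compute all eigenvalues by solving det(A - t I) = 0.
eigenvalues = sorted(A.eigenvals().keys())

# Step 2: find a basis for each eigenspace.
vectors, diagonal_entries = [], []
for lam in eigenvalues:
    for v in (A - lam * sp.eye(n)).nullspace():
        vectors.append(v)
        diagonal_entries.append(lam)

# Step 3: the bases together contain n vectors, so A is diagonalizable;
# put the basis vectors into P as columns.
assert len(vectors) == n
P = sp.Matrix.hstack(*vectors)
D = sp.diag(*diagonal_entries)

# Step 4: P^{-1} A P is the diagonal matrix of corresponding eigenvalues.
assert P.inv() * A * P == D
```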
Example.
(Diagonalization of a $2 \times 2$ matrix)
Recall the example in the motivation section: it is given there that the matrix is diagonalizable, and its expression in the form $PDP^{-1}$ is also given.
We will use the above procedure to derive such an expression.
First, we solve the characteristic equation $\det(A - tI) = 0$, whose roots give the two eigenvalues of the matrix.
For the first eigenvalue $\lambda_1$,
we solve the homogeneous system $(A - \lambda_1 I)\mathbf{v} = \mathbf{0}$;
its general solution is spanned by a single vector,
so a basis for the eigenspace of $\lambda_1$ contains one vector.
For the second eigenvalue $\lambda_2$, we similarly obtain a basis for its eigenspace containing one vector.
Then, we let $P$ be the matrix whose columns are these two basis vectors
(since the two bases together contain two vectors),
and $D$ the diagonal matrix with diagonal entries $\lambda_1$ and $\lambda_2$. Then, we can compute $P^{-1}$.
Therefore, we have $A = PDP^{-1}$,
which is the same as the given form in the example in the motivation section.
In general, if we have $A = PDP^{-1}$, then $A^n = PD^nP^{-1}$ for each positive integer $n$,
which is illustrated in the example in the motivation section.
Example.
(Diagonalization of a $3 \times 3$ matrix)
Consider a $3 \times 3$ matrix $A$.
We would like to find a formula for $A^n$.
First, we solve the characteristic equation $\det(A - tI) = 0$;
suppose it yields two distinct eigenvalues, one of which is a repeated root.
For the repeated eigenvalue $\lambda_1$, suppose the general solution of the homogeneous system $(A - \lambda_1 I)\mathbf{v} = \mathbf{0}$ has two independent unknowns,
so the dimension of the eigenspace is $2$, i.e. there should be two vectors in each of its bases;
a basis for the eigenspace of $\lambda_1$ thus contains two vectors.
For the other eigenvalue $\lambda_2$, a basis for its eigenspace contains one vector.
Then, we let $P$ be the matrix whose columns are the three basis vectors
(since the two bases together contain three vectors),
and $D$ the diagonal matrix with diagonal entries $\lambda_1, \lambda_1, \lambda_2$
(we have two eigenvectors corresponding to the eigenvalue $\lambda_1$, so this eigenvalue is repeated two times in $D$).
Then, we can compute $P^{-1}$.
It follows that $A = PDP^{-1}$,
and $A^n = PD^nP^{-1}$ gives a formula for $A^n$.
This is an interesting result.
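A closed-form power of a $3 \times 3$ matrix with a repeated eigenvalue can be derived symbolically; the matrix below is illustrative (an assumption, not the one from the text):

```python
import sympy as sp

n = sp.symbols('n', integer=True, positive=True)

# Illustrative 3x3 matrix with a repeated eigenvalue (an assumption for
# demonstration): eigenvalue 2 has a 2-dimensional eigenspace.
A = sp.Matrix([[2, 0, 0],
               [0, 2, 0],
               [0, 1, 3]])

P, D = A.diagonalize()          # P^{-1} A P = D
assert P.inv() * A * P == D

# A closed form A^n = P D^n P^{-1}; D^n raises the diagonal entries to n.
Dn = sp.diag(*[d**n for d in D.diagonal()])
An = (P * Dn * P.inv()).applyfunc(sp.simplify)

# Sanity check against direct multiplication for n = 3.
assert An.subs(n, 3) == A**3
```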
Example.
(Complex eigenvalues)
Let $A$ be a real square matrix whose characteristic polynomial has no real roots. [2]
Since the eigenvalues are all complex and non-real (and thus there does not exist any corresponding real eigenvector),
$A$ is not diagonalizable over the real matrices.
On the other hand, $A$ is diagonalizable over the complex matrices,
but diagonalization over complex matrices is not our main focus in this book, and we have not defined operations on complex matrices,
so we do not derive an expression of $A$ in the form $PDP^{-1}$ here.
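As a numerical illustration, the following sketch uses the $2 \times 2$ matrix representing the complex number $i$ (an assumption for demonstration, consistent with the footnote but not necessarily the matrix from the text):

```python
import numpy as np

# Illustrative matrix (an assumption for demonstration): the 2x2 matrix
# representing the complex number i, i.e. a rotation by 90 degrees.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

eigenvalues, P = np.linalg.eig(A)

# The eigenvalues are the non-real numbers i and -i, so A has no real
# eigenvectors and no real diagonalization.
assert np.allclose(sorted(eigenvalues, key=lambda z: z.imag), [-1j, 1j])

# Over the complex numbers, however, P^{-1} A P is diagonal.
assert np.allclose(np.linalg.inv(P) @ A @ P, np.diag(eigenvalues))
```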
Example.
(Non-diagonalizable matrix)
Consider a nonzero $2 \times 2$ matrix $N$ that is nilpotent, satisfying $N^2 = O$.
First, since the characteristic polynomial of $N$ is $\det(N - tI) = t^2$,
the only eigenvalue is $0$.
For the eigenvalue $0$, the eigenspace is the null space of $N - 0I = N$.
So, a basis for the eigenspace contains one vector only.
Since it only contains one vector, while the size of the matrix is $2 \times 2$ (so two linearly independent eigenvectors are needed),
$N$ is not diagonalizable.
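This can be confirmed symbolically with the standard nonzero nilpotent $2 \times 2$ matrix (an illustrative assumption, not necessarily the matrix from the text):

```python
import sympy as sp

# Illustrative nonzero nilpotent matrix (an assumption for demonstration),
# satisfying N^2 = O.
N = sp.Matrix([[0, 1],
               [0, 0]])
assert N**2 == sp.zeros(2, 2)

# The only eigenvalue is 0 (with algebraic multiplicity 2) ...
assert N.eigenvals() == {0: 2}
# ... but its eigenspace, the null space of N, has only one basis vector,
assert len(N.nullspace()) == 1
# so there are not enough linearly independent eigenvectors.
assert not N.is_diagonalizable()
```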
In the following, we will discuss some mathematical applications of diagonalization, namely deducing formulas for sequences and solving systems of ordinary differential equations (ODEs).
Example.
(Fibonacci sequence)
Consider the Fibonacci sequence, in which $F_0 = 0$, $F_1 = 1$, and $F_{n+2} = F_{n+1} + F_n$ for each nonnegative integer $n$.
For each nonnegative integer $n$, this recurrence relation can be described as
$\begin{pmatrix} F_{n+2} \\ F_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix}$.
Let $A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$.
Then, $\begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix} = A^n \begin{pmatrix} F_1 \\ F_0 \end{pmatrix} = A^n \begin{pmatrix} 1 \\ 0 \end{pmatrix}$.
To obtain an expression for $F_n$, it suffices to find a formula for $A^n$, and we may find it by diagonalization.
Since $\det(A - tI) = (1 - t)(-t) - 1 = t^2 - t - 1$, the eigenvalues of $A$ are $\frac{1 \pm \sqrt{5}}{2}$.
Let $\varphi = \frac{1 + \sqrt{5}}{2}$ be the golden ratio,
and $\psi = \frac{1 - \sqrt{5}}{2}$ be the conjugate of the golden ratio.
For the eigenvalue $\varphi$, since $(A - \varphi I)\mathbf{v} = \mathbf{0}$ for each corresponding eigenvector $\mathbf{v}$,
we can transform the augmented matrix representing this SLE to RREF as follows:
$\left(\begin{array}{cc|c} 1 - \varphi & 1 & 0 \\ 1 & -\varphi & 0 \end{array}\right) \to \left(\begin{array}{cc|c} 1 & -\varphi & 0 \\ 0 & 0 & 0 \end{array}\right)$ [3]
So, the general solution is $\mathbf{v} = t \begin{pmatrix} \varphi \\ 1 \end{pmatrix}$, and thus a basis for the eigenspace of $\varphi$ is $\left\{ \begin{pmatrix} \varphi \\ 1 \end{pmatrix} \right\}$.
For the eigenvalue $\psi$, since $(A - \psi I)\mathbf{v} = \mathbf{0}$ for each corresponding eigenvector $\mathbf{v}$,
and the RREF of the augmented matrix representing this SLE is
$\left(\begin{array}{cc|c} 1 & -\psi & 0 \\ 0 & 0 & 0 \end{array}\right)$
by symmetry[4].
So, the general solution is $\mathbf{v} = t \begin{pmatrix} \psi \\ 1 \end{pmatrix}$, and thus a basis for the eigenspace of $\psi$ is $\left\{ \begin{pmatrix} \psi \\ 1 \end{pmatrix} \right\}$.
Then, we let $P = \begin{pmatrix} \varphi & \psi \\ 1 & 1 \end{pmatrix}$,
$D = \begin{pmatrix} \varphi & 0 \\ 0 & \psi \end{pmatrix}$.
We can compute that $P^{-1} = \frac{1}{\varphi - \psi} \begin{pmatrix} 1 & -\psi \\ -1 & \varphi \end{pmatrix} = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 & -\psi \\ -1 & \varphi \end{pmatrix}$.
Then, $A = PDP^{-1}$, and thus
$A^n = PD^nP^{-1} = \frac{1}{\sqrt{5}} \begin{pmatrix} \varphi & \psi \\ 1 & 1 \end{pmatrix} \begin{pmatrix} \varphi^n & 0 \\ 0 & \psi^n \end{pmatrix} \begin{pmatrix} 1 & -\psi \\ -1 & \varphi \end{pmatrix}$.
Finally, we have
$\begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix} = A^n \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{5}} \begin{pmatrix} \varphi^{n+1} - \psi^{n+1} \\ \varphi^n - \psi^n \end{pmatrix}$.
Thus, $F_n = \frac{\varphi^n - \psi^n}{\sqrt{5}}$,
in which $\varphi = \frac{1 + \sqrt{5}}{2}$ and $\psi = \frac{1 - \sqrt{5}}{2}$.
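The closed-form formula derived above can be checked against the recurrence directly:

```python
import math

# Binet's formula F_n = (phi^n - psi^n) / sqrt(5), checked against the
# recurrence F_0 = 0, F_1 = 1, F_{n+2} = F_{n+1} + F_n.
phi = (1 + math.sqrt(5)) / 2    # golden ratio
psi = (1 - math.sqrt(5)) / 2    # conjugate of the golden ratio

def fib_binet(n):
    # Rounding absorbs floating-point error; the exact value is an integer.
    return round((phi**n - psi**n) / math.sqrt(5))

fibs = [0, 1]
for _ in range(20):
    fibs.append(fibs[-1] + fibs[-2])

assert [fib_binet(n) for n in range(len(fibs))] == fibs
```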
Exercise.
Define a sequence in which the first two terms are given and each later term satisfies a linear recurrence of the form $a_{n+2} = p a_{n+1} + q a_n$ for each nonnegative integer $n$, and find a formula for $a_n$ by diagonalization, as in the Fibonacci example.
Example.
(System of ODEs)
Consider a system of two linear ODEs with constant coefficients,
with initial conditions given when $t = 0$.
Using the dot notation for differentiation, the system can be rewritten as $\dot{\mathbf{x}} = A\mathbf{x}$,
in which $\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and $A$ is the coefficient matrix.
Suppose we can write $A = PDP^{-1}$, in which $P$ is an invertible matrix and $D$ is a diagonal matrix with diagonal entries $\lambda_1$ and $\lambda_2$.
Let $\mathbf{y} = P^{-1}\mathbf{x}$, which implies $\mathbf{x} = P\mathbf{y}$ and, since $P$ is a constant matrix, $\dot{\mathbf{x}} = P\dot{\mathbf{y}}$.
It follows that $P\dot{\mathbf{y}} = A\mathbf{x} = PDP^{-1}(P\mathbf{y}) = PD\mathbf{y}$, and thus $\dot{\mathbf{y}} = D\mathbf{y}$.
Since $D$ is diagonal, the system simplifies to the decoupled equations $\dot{y}_1 = \lambda_1 y_1$ and $\dot{y}_2 = \lambda_2 y_2$,
whose general solution is $y_1 = c_1 e^{\lambda_1 t}$ and $y_2 = c_2 e^{\lambda_2 t}$, in which $c_1$ and $c_2$ are arbitrary constants.
Then, we find $P$ and $D$ by diagonalizing $A$:
for each eigenvalue $\lambda_i$, we solve $(A - \lambda_i I)\mathbf{v} = \mathbf{0}$ for its general solution, which yields a basis for the eigenspace of $\lambda_i$.
Then, we let $P$ be the matrix whose columns are the two basis vectors, and $D$ the diagonal matrix with the corresponding eigenvalues $\lambda_1$ and $\lambda_2$ as its diagonal entries.
It follows that $\mathbf{x} = P\mathbf{y}$ expresses $x_1$ and $x_2$ as linear combinations of $c_1 e^{\lambda_1 t}$ and $c_2 e^{\lambda_2 t}$.
Imposing the initial conditions when $t = 0$ determines the constants $c_1$ and $c_2$.
Thus, substituting these constants back into $\mathbf{x} = P\mathbf{y}$ gives the solution to this system of ODEs.
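The decoupling method can be sketched numerically; the coefficient matrix and initial condition below are illustrative assumptions, not the system from the text:

```python
import numpy as np

# Illustrative system (an assumption for demonstration): x' = A x with a
# diagonalizable coefficient matrix A and initial condition x(0).
A = np.array([[1.0, 2.0],
              [2.0, 1.0]])
x0 = np.array([1.0, 0.0])

eigenvalues, P = np.linalg.eig(A)           # A = P D P^{-1}

def x(t):
    # From y' = D y with y = P^{-1} x: y_i(t) = e^{lambda_i t} y_i(0),
    # so x(t) = P diag(e^{lambda_i t}) P^{-1} x(0).
    return P @ np.diag(np.exp(eigenvalues * t)) @ np.linalg.inv(P) @ x0

# The formula satisfies the initial condition ...
assert np.allclose(x(0.0), x0)
# ... and the ODE itself: a central-difference derivative matches A x(t).
t, h = 0.7, 1e-6
numeric_derivative = (x(t + h) - x(t - h)) / (2 * h)
assert np.allclose(numeric_derivative, A @ x(t), atol=1e-4)
```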
↑but even if $A$ has strictly fewer than $n$ different eigenvalues, $A$ can still be diagonalizable. Actually, $A$ can have at most $n$ different eigenvalues, since the characteristic polynomial of $A$ is a degree $n$ polynomial in $t$, which has $n$ roots (some of them may be repeated), by the fundamental theorem of algebra
↑It is the matrix representation of a complex number.