Linear Algebra/Topic: The Method of Powers
In practice, calculating eigenvalues and eigenvectors is a difficult problem. Finding, and solving, the characteristic polynomial of the large matrices often encountered in applications is too slow and too hard. Other techniques, indirect ones that avoid the characteristic polynomial, are used. Here we shall see such a method that is suitable for large matrices that are "sparse" (the great majority of the entries are zero).
Suppose that the $n \times n$ matrix $T$ has $n$ distinct eigenvalues $\lambda_1$, $\lambda_2$, ..., $\lambda_n$. Then $\mathbb{R}^n$ has a basis that is composed of the associated eigenvectors $\langle \vec{\zeta}_1, \ldots, \vec{\zeta}_n \rangle$. For any $\vec{v} \in \mathbb{R}^n$, where $\vec{v} = c_1\vec{\zeta}_1 + \cdots + c_n\vec{\zeta}_n$, iterating $T$ on $\vec{v}$ gives these.

$$T\vec{v} = c_1\lambda_1\vec{\zeta}_1 + c_2\lambda_2\vec{\zeta}_2 + \cdots + c_n\lambda_n\vec{\zeta}_n$$
$$T^2\vec{v} = c_1\lambda_1^2\vec{\zeta}_1 + c_2\lambda_2^2\vec{\zeta}_2 + \cdots + c_n\lambda_n^2\vec{\zeta}_n$$
$$\vdots$$
$$T^k\vec{v} = c_1\lambda_1^k\vec{\zeta}_1 + c_2\lambda_2^k\vec{\zeta}_2 + \cdots + c_n\lambda_n^k\vec{\zeta}_n$$
If one of the eigenvalues, say, $\lambda_1$, has a larger absolute value than any of the other eigenvalues then its term will dominate the above expression. Put another way, dividing through by $\lambda_1^k$ gives this,

$$\frac{T^k\vec{v}}{\lambda_1^k} = c_1\vec{\zeta}_1 + c_2\frac{\lambda_2^k}{\lambda_1^k}\vec{\zeta}_2 + \cdots + c_n\frac{\lambda_n^k}{\lambda_1^k}\vec{\zeta}_n$$

and, because $\lambda_1$ is assumed to have the largest absolute value, as $k$ gets larger the fractions $\lambda_i^k / \lambda_1^k$ go to zero. Thus, the entire expression goes to $c_1\vec{\zeta}_1$.
That is (as long as $c_1$ is not zero), as $k$ increases, the vectors $T^k\vec{v}$ will tend toward the direction of the eigenvectors associated with the dominant eigenvalue $\lambda_1$, and, consequently, the ratios of the lengths $\| T^k\vec{v} \| / \| T^{k-1}\vec{v} \|$ will tend toward the absolute value of that dominant eigenvalue.
For example (sample computer code for this follows the exercises), because the matrix

$$T = \begin{pmatrix} 3 & 0 \\ 8 & -1 \end{pmatrix}$$

is triangular, its eigenvalues are just the entries on the diagonal, $3$ and $-1$. Arbitrarily taking $\vec{v}_0$ to have the components $1$ and $1$ gives

$$\vec{v}_1 = \begin{pmatrix} 3 \\ 7 \end{pmatrix}, \quad \vec{v}_2 = \begin{pmatrix} 9 \\ 17 \end{pmatrix}, \quad \ldots, \quad \vec{v}_9 = \begin{pmatrix} 19683 \\ 39367 \end{pmatrix}, \quad \vec{v}_{10} = \begin{pmatrix} 59049 \\ 118097 \end{pmatrix}$$

and the ratio between the lengths of the last two is $2.9999$.
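To watch the convergence step by step, here is a small Octave loop of ours (separate from the session code that follows the exercises) that applies $T$ repeatedly to the example's starting vector and prints the ratio of successive lengths; the ratios oscillate at first, because the other eigenvalue is negative, and then settle toward $3$.

% Print the ratio of successive lengths at each step; T and the
% starting vector are the ones from the example above.
T = [3, 0;
     8, -1];
v = [1; 1];
for k = 1:10
  w = T * v;
  printf("step %2d: ratio = %f\n", k, norm(w) / norm(v));
  v = w;
endfor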
Two implementation issues must be addressed. The first issue is that, instead of finding the powers of $T$ and applying them to $\vec{v}_0$, we will compute $\vec{v}_1$ as $T\vec{v}_0$ and then compute $\vec{v}_2$ as $T\vec{v}_1$, etc. (i.e., we never separately calculate $T^2$, $T^3$, etc.). These matrix-vector products can be done quickly even if $T$ is large, provided that it is sparse. The second issue is that, to avoid generating numbers that are so large that they overflow our computer's capability, we can normalize the $\vec{v}_k$'s at each step. For instance, we can divide each $\vec{v}_k$ by its length (other possibilities are to divide it by its largest component, or simply by its first component). We thus implement this method by generating

$$\vec{w}_0 = \vec{v}_0 / \|\vec{v}_0\|$$
$$\vec{v}_1 = T\vec{w}_0$$
$$\vec{w}_1 = \vec{v}_1 / \|\vec{v}_1\|$$
$$\vdots$$
$$\vec{v}_k = T\vec{w}_{k-1}$$
$$\vec{w}_k = \vec{v}_k / \|\vec{v}_k\|$$

until we are satisfied. Then the vector $\vec{v}_k$ is an approximation of an eigenvector, and the approximation of the absolute value of the dominant eigenvalue is the ratio $\|\vec{v}_k\| / \|\vec{w}_{k-1}\| = \|\vec{v}_k\|$.
One way we could be "satisfied" is to iterate until our approximation of the eigenvalue settles down. We could decide, for instance, to stop the iteration process not after some fixed number of steps, but instead when $\|\vec{v}_k\|$ differs from $\|\vec{v}_{k-1}\|$ by less than one percent, or when they agree up to the second significant digit.
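Here is a minimal Octave sketch of this scheme; the function name power_method, the relative tolerance tol (the one-percent rule above would be tol = 0.01), and the iteration cap maxits are illustrative choices of ours, not prescribed by the text.

% Sketch of the method of powers with normalization; assumes a
% square matrix T and a nonzero starting vector v0.
function [v, lambda, k] = power_method(T, v0, tol, maxits)
  w = v0 / norm(v0);           % normalize to avoid overflow
  lambda_old = 0;
  for k = 1:maxits
    v = T * w;                 % never form powers of T explicitly
    lambda = norm(v);          % estimates |dominant eigenvalue|
    if abs(lambda - lambda_old) < tol * lambda
      break                    % successive estimates have settled
    endif
    lambda_old = lambda;
    w = v / norm(v);           % normalize for the next step
  endfor
endfunction

For the example matrix, power_method([3, 0; 8, -1], [1; 1], 0.01, 100) settles near $3$ within a few iterations. Since the estimate is a ratio of lengths it approximates the absolute value of the dominant eigenvalue; the sign can be recovered by comparing $\vec{v}_k$ with $T\vec{v}_k$.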
The rate of convergence is determined by the rate at which the powers of $|\lambda_2| / |\lambda_1|$ go to zero, where $\lambda_2$ is the eigenvalue of second largest norm. If that ratio is much less than one then convergence is fast, but if it is only slightly less than one then convergence can be quite slow. Consequently, the method of powers is not the most commonly used way of finding eigenvalues (although it is the simplest one, which is why it is here as the illustration of the possibility of computing eigenvalues without solving the characteristic polynomial). Instead, there are a variety of methods that generally work by first replacing the given matrix with another that is similar to it and so has the same eigenvalues, but is in some reduced form such as tridiagonal form: the only nonzero entries are on the diagonal, or just above or below it. Then special techniques can be used to find the eigenvalues. Once the eigenvalues are known, the eigenvectors of $T$ can be easily computed. These other methods are outside of our scope. A good reference is (Goult et al. 1975).
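To see the effect of that ratio on the speed, here is a small experiment reusing the power_method sketch above; the diagonal test matrices are our own illustrative choices, with $|\lambda_2| / |\lambda_1|$ equal to $0.1$ and $0.9$ respectively, and a tight tolerance so the difference shows.

% Assumes the power_method sketch above is defined.
[v, l, fast_steps] = power_method(diag([10, 1]), [1; 1], 1e-6, 1000);
[v, l, slow_steps] = power_method(diag([10, 9]), [1; 1], 1e-6, 1000);
printf("ratio 0.1: %d steps; ratio 0.9: %d steps\n", fast_steps, slow_steps);

On such a run the first matrix needs only a few iterations while the second needs dozens, since the powers of $0.9$ go to zero so much more slowly than the powers of $0.1$.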
Exercises
- Problem 1
Use ten iterations to estimate the largest eigenvalue of these matrices, starting from the vector with components $1$ and $2$. Compare the answer with the one obtained by solving the characteristic equation.
- Problem 2
Redo the prior exercise by iterating until $\|\vec{v}_k\| - \|\vec{v}_{k-1}\|$ has absolute value less than $0.01$. At each step, normalize $\vec{v}_k$ by dividing each vector by its length. How many iterations are required? Are the answers significantly different?
- Problem 3
Use ten iterations to estimate the largest eigenvalue of these matrices, starting from the vector with components $1$, $2$, and $3$. Compare the answer with the one obtained by solving the characteristic equation.
- Problem 4
Redo the prior exercise by iterating until $\|\vec{v}_k\| - \|\vec{v}_{k-1}\|$ has absolute value less than $0.01$. At each step, normalize $\vec{v}_k$ by dividing each vector by its length. How many iterations does it take? Are the answers significantly different?
- Problem 5
What happens if $c_1 = 0$? That is, what happens if the initial vector does not have any component in the direction of the relevant eigenvector?
- Problem 6
How can the method of powers be adapted to find the smallest eigenvalue?
This is the code for the numerical computation system Octave that was used to do the calculation above. (It has been lightly edited to remove blank lines, etc.)
Computer Code
>T=[3, 0;
8, -1]
T=
3 0
8 -1
>v0=[1; 1]
v0=
1
1
>v1=T*v0
v1=
3
7
>v2=T*v1
v2=
9
17
>T9=T**9
T9=
19683 0
39368 -1
>T10=T**10
T10=
59049 0
118096 1
>v9=T9*v0
v9=
19683
39367
>v10=T10*v0
v10=
59049
118097
>norm(v10)/norm(v9)
ans=2.9999
Remark: we are ignoring the power of Octave here; there are built-in functions to automatically apply quite sophisticated methods to find eigenvalues and eigenvectors. Instead, we are using just the system as a calculator.
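For instance (a usage note of ours, not part of the original session), the built-in routine eig finds all of the eigenvalues in a single call.

>eig(T)   % returns the eigenvalues -1 and 3 (the order may vary)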
References
- Goult, R.J.; Hoskins, R.F.; Milner, J.A.; Pratt, M.J. (1975), Computational Methods in Linear Algebra, Wiley.