Problem 1
Consider the definite integral
$$I(f) = \int_a^b f(x)\,dx.$$
Let $Q_n(f)$ denote the approximation of $I(f)$ by the composite midpoint rule with $n$ uniform subintervals. For every $j \in \mathbb{Z}$ set
$$x_j = a + j\frac{b-a}{n}, \qquad x_{j+\frac{1}{2}} = \frac{x_j + x_{j+1}}{2}.$$
Let $K(x)$ be defined by
$$K(x) = -\frac{1}{2}(x - x_j)^2, \qquad x_{j-\frac{1}{2}} < x < x_{j+\frac{1}{2}}.$$
Assume that $f \in C^2([a,b])$.
Problem 1a
Show that the quadrature error $E_n(f)$ satisfies
$$E_n(f) \equiv Q_n(f) - I(f) = \int_a^b K(x) f''(x)\,dx.$$
Hint: Use integration by parts over each subinterval $[x_{j-1}, x_j]$.
Solution 1a
Integrating $\int K(x) f''(x)\,dx$ by parts twice over an interval with endpoints $p$ and $q$ lying in the same piece $(x_{j-\frac{1}{2}}, x_{j+\frac{1}{2}})$, where $K(x) = -\frac{1}{2}(x-x_j)^2$, gives
$$\int_p^q K(x) f''(x)\,dx = \left[(x - x_j) f(x) - \frac{1}{2}(x - x_j)^2 f'(x)\right]_{x=p}^{x=q} - \int_p^q f(x)\,dx.$$
Since $K(x)$ is defined piecewise on $[x_{j-\frac{1}{2}}, x_{j+\frac{1}{2}}]$, we apply this with
$$[p,q] = [x_j, x_{j+\frac{1}{2}}] \qquad \text{and} \qquad [p,q] = [x_{j-\frac{1}{2}}, x_j].$$
On the first interval the boundary term evaluates to
$$\frac{1}{2}\left[-\frac{1}{4}(x_{j+1} - x_j)^2 f'(x_{j+\frac{1}{2}}) + (x_{j+1} - x_j) f(x_{j+\frac{1}{2}})\right] \qquad \mathbf{(1)}$$
and on the second it evaluates to
$$\frac{1}{2}\left[\frac{1}{4}(x_{j-1} - x_j)^2 f'(x_{j-\frac{1}{2}}) + (x_j - x_{j-1}) f(x_{j-\frac{1}{2}})\right] \qquad \mathbf{(2)}$$
Since these hold on arbitrary half-subintervals, we can rewrite equation $\mathbf{(2)}$ with its indices shifted by one unit. The equation for the interval $[x_{j+\frac{1}{2}}, x_{j+1}]$ is
$$\frac{1}{2}\left[\frac{1}{4}(x_j - x_{j+1})^2 f'(x_{j+\frac{1}{2}}) + (x_{j+1} - x_j) f(x_{j+\frac{1}{2}})\right] \qquad \mathbf{(2')}$$
Combining $\mathbf{(1)}$ and $\mathbf{(2')}$ (the $f'$ terms cancel) and restoring the integral terms from the integration by parts, we have
$$\int_{x_j}^{x_{j+1}} K(x) f''(x)\,dx = (x_{j+1} - x_j) f(x_{j+\frac{1}{2}}) - \int_{x_j}^{x_{j+1}} f(x)\,dx.$$
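This per-interval identity can be checked numerically; the following minimal sketch (our addition, not part of the original solution) uses the sample choice $f(x) = e^x$, for which $f'' = f$ and the antiderivative is exact:

```python
import math

# Numerical check of the per-interval identity
#   int_{x_j}^{x_{j+1}} K(x) f''(x) dx
#     = (x_{j+1} - x_j) f(x_{j+1/2}) - int_{x_j}^{x_{j+1}} f(x) dx
# for f(x) = exp(x) (so f'' = f), with a = 0, b = 1, n = 10, j = 3.
a, b, n, j = 0.0, 1.0, 10, 3
h = (b - a) / n
f = math.exp
xj, xj1 = a + j * h, a + (j + 1) * h
mid = (xj + xj1) / 2

def K(x):
    # On [x_j, x_{j+1/2}] the kernel piece is centered at x_j;
    # on [x_{j+1/2}, x_{j+1}] it is centered at x_{j+1}.
    c = xj if x < mid else xj1
    return -0.5 * (x - c) ** 2

# Fine midpoint sampling of the left-hand side (m even, so no cell
# straddles the kink of K at the subinterval midpoint).
m = 4000
dx = (xj1 - xj) / m
lhs = sum(K(xj + (i + 0.5) * dx) * f(xj + (i + 0.5) * dx) * dx
          for i in range(m))

# Right-hand side, using the exact antiderivative of exp.
rhs = h * f(mid) - (f(xj1) - f(xj))
```

Both sides come out equal to roughly $-6 \times 10^{-5}$ here, i.e. the (signed) midpoint error on this one subinterval.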
Our last step is to sum this over all $n$ subintervals, noting that $x_{j+1} - x_j = (b-a)/n$:
$$\begin{aligned}\sum_{j=0}^{n-1} \int_{x_j}^{x_{j+1}} K(x) f''(x)\,dx &= \sum_{j=0}^{n-1} (x_{j+1} - x_j) f(x_{j+\frac{1}{2}}) - \sum_{j=0}^{n-1} \int_{x_j}^{x_{j+1}} f(x)\,dx\\ \int_a^b K(x) f''(x)\,dx &= \frac{b-a}{n} \sum_{j=0}^{n-1} f\left((x_j + x_{j+1})/2\right) - \int_a^b f(x)\,dx\\ \int_a^b K(x) f''(x)\,dx &= Q_n(f) - I(f)\end{aligned}$$
Problem 1b
Solution 1b
Applying the result of part (a), the triangle inequality, and pulling out the constant $\|f''\|_\infty$, we have
$$\begin{aligned}|E_n(f)| &= \left|\int_a^b K(x) f''(x)\,dx\right|\\&\leq \int_a^b |K(x)|\,|f''(x)|\,dx\\&\leq \|f''\|_\infty \underbrace{\int_a^b |K(x)|\,dx}_{M_n}\end{aligned}$$
$M_n$ is a finite constant since $[a,b]$ is compact and $|K(x)|$ is continuous on each of the finitely many pieces on which it is defined.
The above inequality becomes an equality when $f''(x) = c$ for some constant $c$ (note $K(x) \leq 0$ everywhere, so $K f''$ then has constant sign).
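Although not required by the problem, $M_n$ can be evaluated explicitly; the following short computation (our addition) recovers the classical midpoint-rule constant. With $h = (b-a)/n$, each interior piece contributes
$$\int_{x_{j-\frac{1}{2}}}^{x_{j+\frac{1}{2}}} \frac{1}{2}(x - x_j)^2\,dx = \frac{1}{2}\cdot\frac{2}{3}\left(\frac{h}{2}\right)^3 = \frac{h^3}{24},$$
and the two clipped half-pieces at the endpoints contribute $h^3/48$ each, so
$$M_n = \int_a^b |K(x)|\,dx = n \cdot \frac{h^3}{24} = \frac{(b-a)^3}{24\,n^2},$$
giving the familiar bound $|E_n(f)| \leq \frac{(b-a)^3}{24 n^2}\,\|f''\|_\infty$.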
Problem 2
Consider the (unshifted) QR method for finding the eigenvalues of an invertible matrix $A \in \mathbb{R}^{n \times n}$.
Problem 2a
Solution 2a
The QR algorithm produces a sequence of orthogonally similar matrices $\{A_i\}$ that tends towards upper triangular (or nearly upper triangular) form. This is advantageous since the eigenvalues of an upper triangular matrix lie on its diagonal.
i = 0
A_0 = A
while ( error > tolerance )
    A_i = Q_i R_i        (QR decomposition/factorization of A_i)
    A_{i+1} = R_i Q_i    (multiply the factors in reverse order)
    i = i + 1            (increment)
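The steps above can be sketched as a short runnable implementation (our own illustration, using NumPy's `qr`; the tolerance, iteration cap, and function name are our choices):

```python
import numpy as np

def qr_algorithm(A, tol=1e-12, max_iter=500):
    """Unshifted QR iteration: repeatedly factor A_i = Q_i R_i and
    form A_{i+1} = R_i Q_i until the strict lower triangle is small."""
    Ak = np.array(A, dtype=float)
    for _ in range(max_iter):
        Q, R = np.linalg.qr(Ak)   # A_i = Q_i R_i
        Ak = R @ Q                # A_{i+1} = R_i Q_i, similar to A_i
        if np.max(np.abs(np.tril(Ak, -1))) < tol:
            break
    return Ak

# Example: the matrix from part (d); its eigenvalues 6 and 2
# appear on the diagonal of the (near) upper triangular limit.
A = np.array([[3.0, 3.0], [1.0, 5.0]])
Ak = qr_algorithm(A)
```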
Problem 2b
Show that each of the matrices $A_n$ generated by this algorithm is orthogonally similar to $A$.
Solution 2b
From the factorization step (QR decomposition) of the algorithm, we have
$$A_i = Q_i R_i,$$
which implies
$$Q_i^{-1} A_i = R_i.$$
Substituting into the reverse-multiplication step, we have
$$A_{i+1} = R_i Q_i = Q_i^{-1} A_i Q_i = Q_i^T A_i Q_i,$$
since $Q_i$ is orthogonal. Hence each $A_{i+1}$ is orthogonally similar to $A_i$, and by induction orthogonally similar to $A$ (a product of orthogonal matrices is orthogonal).
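One step of this similarity can be checked numerically (our illustration; the matrix and seed are arbitrary): the factor $Q$ is orthogonal, $A_1$ equals $Q^T A Q$, and the eigenvalues are preserved.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

Q, R = np.linalg.qr(A)   # A = Q R
A1 = R @ Q               # one step: A_1 = Q^{-1} A Q = Q^T A Q

# Orthogonality of Q and the similarity relation.
orth_err = np.max(np.abs(Q.T @ Q - np.eye(5)))
sim_err = np.max(np.abs(A1 - Q.T @ A @ Q))

# Similar matrices share their spectrum.
eig_A = np.sort_complex(np.linalg.eigvals(A))
eig_A1 = np.sort_complex(np.linalg.eigvals(A1))
```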
Problem 2c
Show that if $A$ is upper Hessenberg then so is each of the matrices $A_n$ generated by this algorithm.
Solution 2c
A sequence of Givens rotation matrices pre-multiplying $A$, an upper Hessenberg matrix, yields an upper triangular matrix $R$, i.e.
$$G_{(n-1,n)} \cdots G_{(2,3)} G_{(1,2)} A = R.$$
Since each Givens rotation matrix is orthogonal, we can write
$$A = \underbrace{\left(G_{(n-1,n)} \cdots G_{(2,3)} G_{(1,2)}\right)^T}_{Q}\, R,$$
i.e. $A = QR$.
If we let $A_0 := A$, we have $A_0 = Q_0 R_0$, or more generally for $k = 0, 1, 2, \ldots$
$$A_k = Q_k R_k.$$
In each case, the sequence of Givens rotations that compose $Q$ gives it the following structure:
$$\begin{aligned}Q &= G_{(1,2)}^T G_{(2,3)}^T \cdots G_{(n-1,n)}^T\\&= \begin{pmatrix}*&*&0&0&\cdots&0\\ *&*&0&0&\cdots&0\\0&0&1&0&\cdots&0\\0&0&0&1&\cdots&0\\\vdots&\vdots&\vdots&\ddots&\ddots&\vdots\\0&0&\cdots&\cdots&0&1\end{pmatrix}\begin{pmatrix}1&0&0&0&\cdots&0\\0&*&*&0&\cdots&0\\0&*&*&0&\cdots&0\\0&0&0&1&\cdots&0\\\vdots&\vdots&\vdots&\ddots&\ddots&\vdots\\0&0&\cdots&\cdots&0&1\end{pmatrix}\cdots\begin{pmatrix}1&0&0&0&\cdots&0\\0&1&0&0&\cdots&0\\\vdots&\ddots&\ddots&\ddots&\ddots&0\\0&0&0&1&0&0\\0&0&0&0&*&*\\0&0&0&0&*&*\end{pmatrix}\\&= \begin{pmatrix}*&*&*&\cdots&*\\ *&*&*&\cdots&*\\0&*&*&\cdots&*\\\vdots&\ddots&\ddots&\ddots&\vdots\\0&0&0&*&*\end{pmatrix}\end{aligned}$$
So $Q$ is upper Hessenberg.
From the algorithm, we have
$$\begin{aligned}A_{k+1} &= R_k Q_k\\&= \begin{pmatrix}*&*&*&\cdots&*\\0&*&*&\cdots&*\\0&0&*&\cdots&*\\\vdots&\vdots&\ddots&\ddots&\vdots\\0&0&0&0&*\end{pmatrix}\begin{pmatrix}*&*&*&\cdots&*\\ *&*&*&\cdots&*\\0&*&*&\cdots&*\\\vdots&\ddots&\ddots&\ddots&\vdots\\0&0&0&*&*\end{pmatrix}\end{aligned}$$
We conclude that $A_{k+1}$ is upper Hessenberg: since $Q_k$ is upper Hessenberg, for $j = 1, 2, \ldots, n-2$ the $j$th column of $A_{k+1}$ is a linear combination of the first $j+1$ columns of $R_k$, and those columns vanish below row $j+1$.
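This structure preservation can be observed numerically (a quick sketch of our own; the size and seed are arbitrary): one unshifted QR step applied to an upper Hessenberg matrix leaves everything below the first subdiagonal at (numerical) zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
# Random upper Hessenberg matrix: keep the diagonal, the first
# subdiagonal, and everything above; zero the rest.
A = np.triu(rng.standard_normal((n, n)), k=-1)

Q, R = np.linalg.qr(A)
A_next = R @ Q  # one unshifted QR step

# Largest entry below the first subdiagonal after the step.
below = np.max(np.abs(np.tril(A_next, k=-2)))
```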
Problem 2d
Let
$$A = \begin{pmatrix}3&3\\1&5\end{pmatrix}.$$
For this $A$ the sequence $\{A_n\}$ has a limit. Find this limit. Give your reasoning.
Solution 2d
The eigenvalues of $A$ can be computed from the characteristic polynomial $\lambda^2 - 8\lambda + 12 = (\lambda - 6)(\lambda - 2)$: they are $6$ and $2$. Furthermore, carrying out the matrix multiplications in one step of the algorithm shows that the off-diagonal difference $(a_{1,2} - a_{2,1})$ is invariant in magnitude for all $i$: each step is the orthogonal similarity $A_{i+1} = Q_i^T A_i Q_i$, which multiplies the antisymmetric part of a $2 \times 2$ matrix by $\det Q_i = \pm 1$.
Since the limit of the sequence is an upper triangular matrix with the eigenvalues of $A$ on its diagonal, and $a_{2,1} \to 0$ while $a_{1,2} - a_{2,1}$ keeps magnitude $|3 - 1| = 2$, the limit is
$$\begin{pmatrix}6&2\\0&2\end{pmatrix}$$
Problem 3
Let $A \in \mathbb{R}^{N \times N}$ be symmetric and positive definite. Let $b \in \mathbb{R}^N$. Consider solving $Ax = b$ using the conjugate gradient method. The $n$th iterate $x^{(n)}$ then satisfies
$$\|x^{(n)} - x\|_A \leq \|y - x\|_A \qquad \text{for every } y \in x^{(0)} + \mathcal{K}_n(r^{(0)}, A),$$
where $\|\cdot\|_A$ denotes the vector $A$-norm, $r^{(0)}$ is the initial residual, and
$$\mathcal{K}_n(r^{(0)}, A) = \operatorname{span}\{r^{(0)}, Ar^{(0)}, \ldots, A^{n-1} r^{(0)}\}.$$
Problem 3a
Solution 3a
We know that $\|x^{(n)} - x\|_A \leq \|y - x\|_A$ for every $y \in x^{(0)} + \mathcal{K}_n(r^{(0)}, A)$, so it suffices to choose a $y \in x^{(0)} + \mathcal{K}_n(r^{(0)}, A)$ such that
$$\|x^{(n)} - x\|_A \leq \|y - x\|_A \leq \|p_n(A)\|_A\, \|x^{(0)} - x\|_A.$$
Rewrite r^(0)
First note from the definition of $r^{(0)}$ and the fact that $Ax = b$:
$$\begin{aligned}r^{(0)} &= Ax^{(0)} - b\\&= Ax^{(0)} - Ax\\&= A(x^{(0)} - x)\end{aligned}$$
Rewrite Krylov space
Therefore, we can rewrite $\mathcal{K}_n(r^{(0)}, A)$ as follows:
$$\begin{aligned}\mathcal{K}_n(r^{(0)}, A) &= \mathcal{K}_n(A(x^{(0)} - x), A)\\&= \operatorname{span}\{A(x^{(0)} - x), A^2(x^{(0)} - x), \ldots, A^n(x^{(0)} - x)\}\end{aligned}$$
Write y explicitly
We can then write $y$ explicitly:
$$y = x^{(0)} + \alpha_1 A(x^{(0)} - x) + \alpha_2 A^2(x^{(0)} - x) + \ldots + \alpha_n A^n(x^{(0)} - x)$$
for arbitrary real numbers $\alpha_i$, $i = 1, 2, \ldots, n$.
Substitute and Apply Inequality
We substitute $y$ into the hypothesis inequality and apply the submultiplicative property of the induced matrix $A$-norm. Writing $P(z) = 1 + \alpha_1 z + \alpha_2 z^2 + \ldots + \alpha_n z^n$, an arbitrary degree-$n$ polynomial with $P(0) = 1$, we get the desired result:
$$\begin{aligned}\|x^{(n)} - x\|_A &\leq \|y - x\|_A\\&= \|(x^{(0)} - x) + \alpha_1 A(x^{(0)} - x) + \alpha_2 A^2(x^{(0)} - x) + \ldots + \alpha_n A^n(x^{(0)} - x)\|_A\\&= \|P(A)(x^{(0)} - x)\|_A\\&\leq \|P(A)\|_A\, \|x^{(0)} - x\|_A\end{aligned}$$
Problem 3b
Let $T_n(z)$ denote the $n$th Chebyshev polynomial. Let $\lambda_{\min}$ and $\lambda_{\max}$ denote respectively the smallest and largest eigenvalues of $A$. Apply the result of part (a) to
$$p_n(z) = \frac{T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min} - 2z}{\lambda_{\max} - \lambda_{\min}}\right)}{T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}}\right)}$$
to show that
$$\|x^{(n)} - x\|_A \leq \frac{1}{T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}}\right)}\, \|x^{(0)} - x\|_A.$$
You can use without proof the fact that
$$\|p_n(A)\|_A = \max\{|p_n(z)| : z \in Sp(A)\},$$
where $Sp(A)$ denotes the set of eigenvalues of $A$, and the facts that for every $n \in \mathbb{N}$ the polynomial $T_n(z)$ has degree $n$, is positive for $z > 1$, and satisfies
$$T_n(\cos\theta) = \cos(n\theta) \qquad \text{for all } \theta \in \mathbb{R}.$$
Solution 3b
Overview
We want to show that
$$\|p_n(A)\|_A = \frac{1}{T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}}\right)}$$
Maximize Numerator of p_n(z)
By hypothesis,
$$\|p_n(A)\|_A = \max\{|p_n(z)| : z \in Sp(A)\}.$$
Since only the numerator of $p_n(z)$ depends on $z$, we only need to maximize the numerator in order to maximize $|p_n(z)|$, i.e. to find
$$\max_{z \in [\lambda_{\min}, \lambda_{\max}]} \left|T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min} - 2z}{\lambda_{\max} - \lambda_{\min}}\right)\right|.$$
Rewrite T_n
Let $\cos\theta = x$, so $\theta = \arccos(x)$ for $x \in [-1, 1]$. Hence
$$T_n(x) = T_n(\cos\theta) = \cos(n\theta) = \cos(n \arccos(x)).$$
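This closed form can be sanity-checked against NumPy's Chebyshev evaluation (our own quick check, not part of the solution; the degree and grid are arbitrary):

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Verify T_n(x) = cos(n * arccos(x)) on [-1, 1] for n = 5.
n = 5
xs = np.linspace(-1.0, 1.0, 101)
coeffs = [0] * n + [1]              # Chebyshev-basis coefficients of T_n
lhs = C.chebval(xs, coeffs)         # T_n evaluated via the recurrence
rhs = np.cos(n * np.arccos(xs))     # the trigonometric closed form
```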
Max of T_n is 1
Denote the argument of $T_n$ by $x(z)$, since it depends on $z$:
$$x(z) = \frac{\lambda_{\max} + \lambda_{\min} - 2z}{\lambda_{\max} - \lambda_{\min}}.$$
Then
$$x(\lambda_{\min}) = \frac{\lambda_{\max} - \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}} = 1, \qquad x(\lambda_{\max}) = \frac{\lambda_{\min} - \lambda_{\max}}{\lambda_{\max} - \lambda_{\min}} = -1.$$
Since $x(z)$ is linear in $z$, for $z \in [\lambda_{\min}, \lambda_{\max}]$ we have $x(z) \in [-1, 1]$.
Now, since $\arccos(x)$ is real for $x \in [-1, 1]$,
$$-1 \leq \underbrace{\cos(n \overbrace{\arccos(x)}^{\in \mathbb{R}})}_{T_n(x)} \leq 1.$$
Hence
$$\max_{x \in [-1,1]} |T_n(x)| = \max_{x \in [-1,1]} |\cos(n \arccos(x))| = 1.$$
Show T_n(1)=1
Let $z = \lambda_{\min}$; then
$$\left|T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min} - 2\lambda_{\min}}{\lambda_{\max} - \lambda_{\min}}\right)\right| = |T_n(1)|.$$
Using our formula, with $\arccos(1) = 0$,
$$|T_n(1)| = |\cos(n \cdot \arccos(1))| = |\cos(0)| = 1.$$
In other words, at $z = \lambda_{\min}$ the numerator $T_n$ achieves its maximum value of $1$. Since $\lambda_{\min} \in Sp(A)$, the maximum of $|p_n(z)|$ over the spectrum equals $1 / T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}}\right)$, and applying part (a) with this $p_n$ gives
$$\|x^{(n)} - x\|_A \leq \frac{1}{T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}}\right)}\, \|x^{(0)} - x\|_A.$$
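As a closing sanity check (our addition, not part of the original solution), the Chebyshev bound can be verified numerically with a plain conjugate gradient implementation; the matrix, seed, and iteration count below are arbitrary choices, and $T_n(z)$ is evaluated for $z \geq 1$ via its hyperbolic form $T_n(z) = \cosh(n \operatorname{arccosh} z)$:

```python
import numpy as np

def cg_errors(A, b, x0, iters):
    """Plain conjugate gradient; returns the A-norm errors ||x_k - x||_A."""
    x_true = np.linalg.solve(A, b)
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    errs = [np.sqrt((x - x_true) @ A @ (x - x_true))]
    for _ in range(iters):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
        errs.append(np.sqrt((x - x_true) @ A @ (x - x_true)))
    return np.array(errs)

def cheb(n, z):
    # T_n(z) for z >= 1, via T_n(z) = cosh(n * arccosh(z))
    return np.cosh(n * np.arccosh(z))

rng = np.random.default_rng(2)
M = rng.standard_normal((30, 30))
A = M @ M.T + 30 * np.eye(30)      # symmetric positive definite
b = rng.standard_normal(30)
x0 = np.zeros(30)

lam = np.linalg.eigvalsh(A)
arg = (lam[-1] + lam[0]) / (lam[-1] - lam[0])
errs = cg_errors(A, b, x0, 10)
bounds = np.array([errs[0] / cheb(n, arg) for n in range(11)])
```

Every A-norm error should fall under the corresponding bound $\|x^{(0)} - x\|_A / T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}}\right)$.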