Consider the definite integral

$I(f) = \int_a^b f(x)\,dx.$

Let $Q_n(f)$ denote the approximation of $I(f)$ by the composite midpoint rule with $n$ uniform subintervals. For every $j \in \mathbb{Z}$ set

$x_j = a + j\,\frac{b-a}{n}, \qquad x_{j+\frac{1}{2}} = \frac{x_j + x_{j+1}}{2}.$

Let $K(x)$ be defined by

$K(x) = -\frac{1}{2}(x - x_j)^2, \qquad x_{j-\frac{1}{2}} < x < x_{j+\frac{1}{2}}.$

Assume that $f \in C^2([a,b])$.
Show that the quadrature error $E_n(f)$ satisfies

$E_n \equiv Q_n(f) - I(f) = \int_a^b K(x) f''(x)\,dx.$

Hint: Use integration by parts over each subinterval $[x_{j-1}, x_j]$.
Integrating $\int K(x) f''(x)\,dx$ by parts between arbitrary points $p$ and $q$ gives

$\int_p^q K(x) f''(x)\,dx = \left[(x - x_j) f(x) - \frac{1}{2}(x - x_j)^2 f'(x)\right]_{x=p}^{x=q} - \int_p^q f(x)\,dx.$
Since $K(x)$ is defined on $[x_{j-\frac{1}{2}}, x_{j+\frac{1}{2}}]$, we use $[p,q] = [x_j, x_{j+\frac{1}{2}}]$ and $[p,q] = [x_{j-\frac{1}{2}}, x_j]$.
Using the first interval, the bracketed boundary term evaluates to

$\frac{1}{2}\left[-\frac{1}{4}(x_{j+1} - x_j)^2 f'\!\left(x_{j+\frac{1}{2}}\right) + (x_{j+1} - x_j) f\!\left(x_{j+\frac{1}{2}}\right)\right] \qquad \mathbf{(1)}$

(the term $-\int_p^q f(x)\,dx$ from the integration by parts is carried along separately).
And for the second we get

$\frac{1}{2}\left[\frac{1}{4}(x_{j-1} - x_j)^2 f'\!\left(x_{j-\frac{1}{2}}\right) + (x_j - x_{j-1}) f\!\left(x_{j-\frac{1}{2}}\right)\right] \qquad \mathbf{(2)}$
Since these apply to arbitrary half-subintervals, we can rewrite equation $\mathbf{(2)}$ with its indices shifted by one unit. The equation for the interval $[x_{j+\frac{1}{2}}, x_{j+1}]$ is
$\frac{1}{2}\left[\frac{1}{4}(x_j - x_{j+1})^2 f'\!\left(x_{j+\frac{1}{2}}\right) + (x_{j+1} - x_j) f\!\left(x_{j+\frac{1}{2}}\right)\right] \qquad \mathbf{(2')}$
Combining $\mathbf{(1)}$ and $\mathbf{(2')}$ (the $f'$ terms cancel) and restoring the $-\int f$ contributions from the integration by parts, we have over the full subinterval

$\int_{x_j}^{x_{j+1}} K(x) f''(x)\,dx = (x_{j+1} - x_j)\, f\!\left(x_{j+\frac{1}{2}}\right) - \int_{x_j}^{x_{j+1}} f(x)\,dx.$
Our last step is to sum this over all $n$ subintervals, noting that $x_{j+1} - x_j = (b-a)/n$:
$\begin{aligned}
\sum_{j=0}^{n-1} \int_{x_j}^{x_{j+1}} K(x) f''(x)\,dx &= \sum_{j=0}^{n-1} (x_{j+1} - x_j)\, f\!\left(x_{j+\frac{1}{2}}\right) - \sum_{j=0}^{n-1} \int_{x_j}^{x_{j+1}} f(x)\,dx \\
\int_a^b K(x) f''(x)\,dx &= \frac{b-a}{n} \sum_{j=0}^{n-1} f\!\left(\frac{x_j + x_{j+1}}{2}\right) - \int_a^b f(x)\,dx \\
\int_a^b K(x) f''(x)\,dx &= Q_n(f) - I(f)
\end{aligned}$
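This identity is easy to sanity-check numerically. The sketch below (NumPy; $f(x) = e^x$ on $[0,1]$ is an arbitrary test choice, not from the original) compares $Q_n(f) - I(f)$ against a fine quadrature of $\int_a^b K(x) f''(x)\,dx$, integrating $K$ piece by piece.

import numpy as np

# Check E_n = Q_n(f) - I(f) = \int_a^b K(x) f''(x) dx for f(x) = exp(x).
a, b, n = 0.0, 1.0, 10
h = (b - a) / n
f = d2f = np.exp                      # f and f'' coincide for exp

# Composite midpoint rule Q_n(f) and the exact integral I(f)
mids = a + (np.arange(n) + 0.5) * h
Q_n = h * f(mids).sum()
I_f = np.exp(b) - np.exp(a)

# \int_a^b K f'' with K(x) = -(x - x_j)^2 / 2 on (x_{j-1/2}, x_{j+1/2}),
# clipped to [a, b]; each piece is integrated by a fine midpoint sum.
total = 0.0
for j in range(n + 1):
    x_j = a + j * h
    lo, hi = max(a, x_j - h / 2), min(b, x_j + h / 2)
    t = np.linspace(lo, hi, 4001)
    tm, dt = (t[:-1] + t[1:]) / 2, np.diff(t)
    total += np.sum(-0.5 * (tm - x_j) ** 2 * d2f(tm) * dt)

print(Q_n - I_f, total)               # the two values agree closely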
Applying the result of part (a), the triangle inequality, and pulling out the constant $\|f''\|_\infty$, we have
$\begin{aligned}
|E_n(f)| &= \left|\int_a^b K(x) f''(x)\,dx\right| \\
&\le \int_a^b |K(x)|\,|f''(x)|\,dx \\
&\le \|f''\|_\infty \underbrace{\int_a^b |K(x)|\,dx}_{M_n}
\end{aligned}$
$M_n$ is a finite constant since $[a,b]$ is compact and $|K(x)|$ is continuous on each of the finite number of intervals on which it is defined. In fact, a direct computation (integrating $\tfrac{1}{2}(x - x_j)^2$ over each piece of $K$) gives $M_n = \int_a^b |K(x)|\,dx = \frac{(b-a)h^2}{24}$, where $h = (b-a)/n$.
The above inequality becomes an equality when $f''(x) = c$, where $c$ is any constant, since $K(x) \le 0$ everywhere and $|f''|$ is then constant.
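As a quick illustration of sharpness, the sketch below (NumPy; the test function $f(x) = cx^2/2$ with $c = 3$ is an arbitrary choice, so that $f'' \equiv c$) checks that $|E_n| = \|f''\|_\infty M_n$ with $M_n = (b-a)h^2/24$.

import numpy as np

# Equality case: f'' constant. Test choice: f(x) = c x^2 / 2 with c = 3.
a, b, n = 0.0, 1.0, 16
h = (b - a) / n
c = 3.0
f = lambda x: 0.5 * c * x ** 2

mids = a + (np.arange(n) + 0.5) * h
Q_n = h * f(mids).sum()               # composite midpoint rule
I_f = c * (b ** 3 - a ** 3) / 6       # exact integral of c x^2 / 2

M_n = (b - a) * h ** 2 / 24           # \int_a^b |K(x)| dx
print(abs(Q_n - I_f), c * M_n)        # equal up to rounding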
Consider the (unshifted) QR method for finding the eigenvalues of an invertible matrix $A \in \mathbb{R}^{n \times n}$. The QR algorithm produces a sequence of similar matrices $\{A_i\}$ that tends to an upper triangular, or nearly upper triangular, matrix. This is advantageous since the eigenvalues of an upper triangular matrix lie on its diagonal.
In runnable form (a NumPy sketch; the pseudocode's "error" is taken here to be the size of the part below the diagonal, one reasonable stopping criterion):

import numpy as np

def qr_algorithm(A, tol=1e-12, max_iter=1000):
    A_i = np.array(A, dtype=float)       # A_0 = A
    i = 0
    while np.linalg.norm(np.tril(A_i, -1)) > tol and i < max_iter:
        Q, R = np.linalg.qr(A_i)         # factor: A_i = Q_i R_i
        A_i = R @ Q                      # reverse multiply: A_{i+1} = R_i Q_i
        i += 1                           # increment
    return A_i
Show that each of the matrices $A_n$ generated by this algorithm is orthogonally similar to $A$.
From the factor step (QR decomposition) of the algorithm, we have

$A_i = Q_i R_i,$

which implies

$Q_i^{-1} A_i = R_i.$

Substituting into the reverse multiply step, we have

$A_{i+1} = R_i Q_i = Q_i^{-1} A_i Q_i = Q_i^T A_i Q_i,$

since each $Q_i$ is orthogonal. By induction, $A_{i+1} = (Q_0 Q_1 \cdots Q_i)^T A\,(Q_0 Q_1 \cdots Q_i)$, so every iterate is orthogonally similar to $A$.
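A small numerical confirmation (a sketch on a random test matrix, not part of the original argument): one step of the iteration is an explicit orthogonal similarity, so the spectrum is unchanged.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
Q, R = np.linalg.qr(A)                       # factor: A = Q R
A_next = R @ Q                               # one QR-algorithm step

print(np.allclose(A_next, Q.T @ A @ Q))      # True: orthogonal similarity
print(np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                  np.sort_complex(np.linalg.eigvals(A_next))))  # True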
Show that if $A$ is upper Hessenberg then so is each of the matrices $A_n$ generated by this algorithm.
A sequence of Givens rotation matrices pre-multiplying $A$, an upper Hessenberg matrix, yields an upper triangular matrix $R$, i.e.

$G_{(n-1,n)} \cdots G_{(2,3)} G_{(1,2)} A = R.$
Since Givens rotation matrices are each orthogonal, we can write

$A = \underbrace{(G_{(n-1,n)} \cdots G_{(2,3)} G_{(1,2)})^T}_{Q} R,$
i.e. $A = QR$. If we let $A_0 := A$, we have $A_0 = Q_0 R_0$, or more generally, for $k = 1, 2, 3, \ldots$,

$A_k = Q_k R_k.$
In each case, the sequence of Givens rotation matrices that compose $Q$ has the following structure:
$\begin{aligned}
Q &= G_{(1,2)}^T G_{(2,3)}^T \cdots G_{(n-1,n)}^T \\
&= \begin{pmatrix} *&*&0&0&\cdots&0 \\ *&*&0&0&\cdots&0 \\ 0&0&1&0&\cdots&0 \\ 0&0&0&1&\cdots&0 \\ \vdots&\vdots&\vdots&\ddots&\ddots&\vdots \\ 0&0&\cdots&\cdots&0&1 \end{pmatrix}
\begin{pmatrix} 1&0&0&0&\cdots&0 \\ 0&*&*&0&\cdots&0 \\ 0&*&*&0&\cdots&0 \\ 0&0&0&1&\cdots&0 \\ \vdots&\vdots&\vdots&\ddots&\ddots&\vdots \\ 0&0&\cdots&\cdots&0&1 \end{pmatrix}
\cdots
\begin{pmatrix} 1&0&0&0&\cdots&0 \\ 0&1&0&0&\cdots&0 \\ \vdots&\ddots&\ddots&\ddots&\ddots&0 \\ 0&0&0&1&0&0 \\ 0&0&0&0&*&* \\ 0&0&0&0&*&* \end{pmatrix} \\
&= \begin{pmatrix} *&*&*&\cdots&* \\ *&*&*&\cdots&* \\ 0&*&*&\cdots&* \\ \vdots&\ddots&\ddots&\ddots&\vdots \\ 0&0&0&*&* \end{pmatrix}
\end{aligned}$
So $Q$ is upper Hessenberg.
From the algorithm, we have

$\begin{aligned}
A_{k+1} &= R_k Q_k \\
&= \begin{pmatrix} *&*&*&\cdots&* \\ 0&*&*&\cdots&* \\ 0&0&*&\cdots&* \\ \vdots&\vdots&\ddots&\ddots&\vdots \\ 0&0&0&0&* \end{pmatrix}
\begin{pmatrix} *&*&*&\cdots&* \\ *&*&*&\cdots&* \\ 0&*&*&\cdots&* \\ \vdots&\ddots&\ddots&\ddots&\vdots \\ 0&0&0&*&* \end{pmatrix}
\end{aligned}$
We conclude that $A_{k+1}$ is upper Hessenberg: since $Q_k$ is upper Hessenberg, for $j = 1, 2, \ldots, n-2$ the $j$th column of $A_{k+1}$ is a linear combination of the first $j+1$ columns of $R_k$, each of which is zero below row $j+1$.
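The column argument can be observed directly. A sketch (random upper Hessenberg test matrix, an arbitrary choice): after several iterations the entries below the first subdiagonal remain at roundoff level.

import numpy as np

rng = np.random.default_rng(1)
# np.triu(..., -1) zeros everything below the first subdiagonal,
# i.e. produces an upper Hessenberg test matrix.
A = np.triu(rng.standard_normal((6, 6)), -1)
for _ in range(10):
    Q, R = np.linalg.qr(A)
    A = R @ Q
print(np.max(np.abs(np.tril(A, -2))))   # ~1e-16: still upper Hessenberg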
Let

$A = \begin{pmatrix} 3 & 3 \\ 1 & 5 \end{pmatrix}.$

For this $A$ the sequence $\{A_n\}$ has a limit. Find this limit. Give your reasoning.
The eigenvalues of $A$ can be computed: they are $6$ and $2$. Furthermore, tracking the matrix multiplies in the algorithm shows that the off-diagonal difference $(a_{1,2} - a_{2,1})$ is constant for all $i$: each $Q_i$ is a plane rotation, and the skew-symmetric part $A_i - A_i^T$ is invariant under similarity by rotations. Here $a_{1,2} - a_{2,1} = 3 - 1 = 2$.
Since the limit of $\{A_i\}$ is an upper triangular matrix with the eigenvalues of $A$ on its diagonal, and the limit must still satisfy $a_{1,2} - a_{2,1} = 2$ with $a_{2,1} = 0$, the limit is

$\begin{pmatrix} 6 & 2 \\ 0 & 2 \end{pmatrix}$
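This is easy to confirm by running the iteration. The sketch below does each factorization with an explicit $2 \times 2$ Givens rotation (so $\det Q = +1$ and the invariant $a_{1,2} - a_{2,1} = 2$ is preserved exactly, matching the reasoning above):

import numpy as np

A = np.array([[3.0, 3.0], [1.0, 5.0]])
for _ in range(60):
    c, s = A[0, 0], A[1, 0]
    r = np.hypot(c, s)
    Q = np.array([[c, -s], [s, c]]) / r   # rotation zeroing the (2,1) entry
    R = Q.T @ A                           # upper triangular
    A = R @ Q                             # A_{i+1} = R_i Q_i
print(np.round(A, 8))                     # [[6. 2.] [0. 2.]]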
Let $A \in \mathbb{R}^{N \times N}$ be symmetric and positive definite. Let $b \in \mathbb{R}^N$. Consider solving $Ax = b$ using the conjugate gradient method. The $n$th iterate $x^{(n)}$ then satisfies

$\|x^{(n)} - x\|_A \le \|y - x\|_A$

for every $y \in x^{(0)} + \mathcal{K}_n(r^{(0)}, A)$, where $\|\cdot\|_A$ denotes the vector $A$-norm, $r^{(0)}$ is the initial residual, and

$\mathcal{K}_n(r^{(0)}, A) = \operatorname{span}\{r^{(0)}, A r^{(0)}, \ldots, A^{n-1} r^{(0)}\}.$
We know that

$\|x^{(n)} - x\|_A \le \|y - x\|_A$

for every $y \in x^{(0)} + \mathcal{K}_n(r^{(0)}, A)$, so if we can choose a $y \in x^{(0)} + \mathcal{K}_n(r^{(0)}, A)$ such that

$\|x^{(n)} - x\|_A \le \|y - x\|_A \le \|p_n(A)\|_A \,\|x^{(0)} - x\|_A,$

then we can solve part (a).
First note from the definition of $r^{(0)}$:

$\begin{aligned}
r^{(0)} &= A x^{(0)} - b \\
&= A x^{(0)} - A x \\
&= A(x^{(0)} - x)
\end{aligned}$
Rewrite Krylov space
Therefore, we can rewrite $\mathcal{K}_n(r^{(0)}, A)$ as follows:

$\begin{aligned}
\mathcal{K}_n(r^{(0)}, A) &= \mathcal{K}_n(A(x^{(0)} - x), A) \\
&= \operatorname{span}\{A(x^{(0)} - x), A^2(x^{(0)} - x), \ldots, A^n(x^{(0)} - x)\}
\end{aligned}$
We can then write $y$ explicitly as follows: since

$y \in x^{(0)} + \mathcal{K}_n(r^{(0)}, A) = x^{(0)} + \mathcal{K}_n(A(x^{(0)} - x), A),$

we have

$y = x^{(0)} + \alpha_1 A(x^{(0)} - x) + \alpha_2 A^2(x^{(0)} - x) + \cdots + \alpha_n A^n(x^{(0)} - x)$

for arbitrary real numbers $\alpha_i$, $i = 1, 2, \ldots, n$.
Substitute and Apply Inequality
We substitute $y$ into the hypothesis inequality and apply the compatibility of the induced matrix $A$-norm with the vector $A$-norm to get the desired result.
$\begin{aligned}
\|x^{(n)} - x\|_A &\le \|y - x\|_A \\
&= \|(x^{(0)} - x) + \alpha_1 A(x^{(0)} - x) + \alpha_2 A^2(x^{(0)} - x) + \cdots + \alpha_n A^n(x^{(0)} - x)\|_A \\
&= \|P(A)(x^{(0)} - x)\|_A \\
&\le \|P(A)\|_A \,\|x^{(0)} - x\|_A
\end{aligned}$

where $P(A) = I + \alpha_1 A + \alpha_2 A^2 + \cdots + \alpha_n A^n$, i.e. $P$ is a polynomial of degree at most $n$ with $P(0) = 1$.
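The optimality statement can be probed numerically. The sketch below (a textbook CG loop on a random SPD system; the test matrix, dimensions, and variable names are arbitrary choices) compares the $A$-norm error of the CG iterate against random competitors from the same affine Krylov space; the CG error is never beaten.

import numpy as np

rng = np.random.default_rng(2)
N, n = 30, 5
M = rng.standard_normal((N, N))
A = M @ M.T + N * np.eye(N)            # SPD test matrix
b = rng.standard_normal(N)
x = np.linalg.solve(A, b)              # exact solution, for reference
a_norm = lambda v: np.sqrt(v @ A @ v)

# n steps of conjugate gradients from x0 = 0
x0 = np.zeros(N)
xk, r = x0.copy(), b - A @ x0
p, rs = r.copy(), r @ r
for _ in range(n):
    Ap = A @ p
    alpha = rs / (p @ Ap)
    xk += alpha * p
    r -= alpha * Ap
    rs, rs_old = r @ r, rs
    p = r + (rs / rs_old) * p

# Random elements of x0 + K_n(r0, A) never do better in the A-norm
r0 = b - A @ x0
K = np.column_stack([np.linalg.matrix_power(A, j) @ r0 for j in range(n)])
best_rival = min(a_norm(x0 + K @ rng.standard_normal(n) - x)
                 for _ in range(1000))
print(a_norm(xk - x), "<=", best_rival)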
Let $T_n(z)$ denote the $n$th Chebyshev polynomial. Let $\lambda_{\min}$ and $\lambda_{\max}$ denote respectively the smallest and largest eigenvalues of $A$. Apply the result of part (a) to

$p_n(z) = \frac{T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min} - 2z}{\lambda_{\max} - \lambda_{\min}}\right)}{T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}}\right)}$

to show that

$\|x^{(n)} - x\|_A \le \frac{1}{T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}}\right)} \|x^{(0)} - x\|_A.$

You can use without proof the fact that

$\|p_n(A)\|_A = \max\{|p_n(z)| : z \in \operatorname{Sp}(A)\},$

where $\operatorname{Sp}(A)$ denotes the set of eigenvalues of $A$, and the facts that for every $n \in \mathbb{N}$ the polynomial $T_n(z)$ has degree $n$, is positive for $z > 1$, and satisfies

$T_n(\cos\theta) = \cos(n\theta) \quad \text{for all } \theta \in \mathbb{R}.$
We want to show that

$\|p_n(A)\|_A = \frac{1}{T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}}\right)}.$
Maximize Numerator of $p_n(z)$
By hypothesis,

$\|p_n(A)\|_A = \max\{|p_n(z)| : z \in \operatorname{Sp}(A)\}.$

Since only the numerator of $p_n(z)$ depends on $z$, we only need to maximize the numerator in order to maximize $|p_n(z)|$. Since $\operatorname{Sp}(A) \subset [\lambda_{\min}, \lambda_{\max}]$, it suffices to find

$\max_{z \in [\lambda_{\min}, \lambda_{\max}]} \left| T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min} - 2z}{\lambda_{\max} - \lambda_{\min}}\right) \right|.$
Let $\cos\theta = x$; then $\theta = \arccos x$. Hence,

$T_n(\cos\theta) = T_n(x) = \cos(n\theta) = \cos(n \arccos x),$

so

$T_n(x) = \cos(n \arccos x) \quad \text{for } x \in [-1, 1].$
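This identity is easy to verify numerically, e.g. with NumPy's Chebyshev class (a quick sketch; $n = 5$ is chosen arbitrarily):

import numpy as np
from numpy.polynomial.chebyshev import Chebyshev

n, x = 5, np.linspace(-1, 1, 7)
print(np.allclose(Chebyshev.basis(n)(x), np.cos(n * np.arccos(x))))  # True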
Denote the argument of $T_n$ by $x(z)$, since the argument depends on $z$:

$x(z) = \frac{\lambda_{\max} + \lambda_{\min} - 2z}{\lambda_{\max} - \lambda_{\min}}.$
Then,

$x(\lambda_{\min}) = \frac{\lambda_{\max} - \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}} = 1, \qquad x(\lambda_{\max}) = \frac{\lambda_{\min} - \lambda_{\max}}{\lambda_{\max} - \lambda_{\min}} = -1.$

Thus $x(z) \in [-1, 1]$ for $z \in [\lambda_{\min}, \lambda_{\max}]$.
Now, since $\arccos x$ is real for $x \in [-1, 1]$,

$-1 \le \underbrace{\cos(n \overbrace{\arccos x}^{\in \mathbb{R}})}_{T_n(x)} \le 1.$
Hence,

$\max_{x \in [-1, 1]} |T_n(x)| = \max_{x \in [-1, 1]} |\cos(n \arccos x)| = 1.$
Let $z = \lambda_{\min}$; then

$\left| T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min} - 2\lambda_{\min}}{\lambda_{\max} - \lambda_{\min}}\right) \right| = |T_n(1)|.$
Using our formula, we have

$\begin{aligned}
|T_n(1)| &= |\cos(n \cdot \arccos(1))| \\
&= |\cos(n \cdot 0)| \\
&= 1.
\end{aligned}$
In other words, for $z = \lambda_{\min}$, $T_n$ achieves its maximum absolute value of $1$. Hence the numerator of $p_n$ is bounded by $1$ in absolute value on $[\lambda_{\min}, \lambda_{\max}]$ and attains it at $\lambda_{\min} \in \operatorname{Sp}(A)$, so

$\|p_n(A)\|_A = \frac{1}{T_n\!\left(\frac{\lambda_{\max} + \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}}\right)},$

and combining this with part (a) gives the desired bound.
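Putting the pieces together numerically: the sketch below runs CG on a random SPD system (the test setup is an arbitrary assumption, not from the original) and checks the resulting bound at each step, using $T_n(\mu) = \cosh(n \operatorname{arccosh} \mu)$ for $\mu > 1$.

import numpy as np

rng = np.random.default_rng(3)
N = 40
M = rng.standard_normal((N, N))
A = M @ M.T + np.eye(N)                   # SPD test matrix
b = rng.standard_normal(N)
x = np.linalg.solve(A, b)
lmin, lmax = np.linalg.eigvalsh(A)[[0, -1]]
mu = (lmax + lmin) / (lmax - lmin)
a_norm = lambda v: np.sqrt(v @ A @ v)

xk, r = np.zeros(N), b.copy()             # CG from x0 = 0, so r0 = b
p, rs = r.copy(), r @ r
e0 = a_norm(xk - x)
for n in range(1, 11):
    Ap = A @ p
    alpha = rs / (p @ Ap)
    xk += alpha * p
    r -= alpha * Ap
    rs, rs_old = r @ r, rs
    p = r + (rs / rs_old) * p
    bound = e0 / np.cosh(n * np.arccosh(mu))        # e0 / T_n(mu)
    print(n, a_norm(xk - x) <= bound * (1 + 1e-9))  # True at every step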