A set of functions $\{g_1,\ldots,g_n\}\subset C[a,b]$ is a Chebyshev system if
(i) The set is linearly independent.
(ii) If $g$ is a linear combination of $\{g_1,\ldots,g_n\}$ that is not identically zero, then $g$ has at most $n-1$ distinct zeros in $[a,b]$.
Show that $\{g_1,\ldots,g_n\}$ is a Chebyshev system if and only if for any $n$ distinct points $x_1,\ldots,x_n\in[a,b]$ the matrix $A$ with $a_{i,j}=g_j(x_i)$, $1\leq i,j\leq n$, is non-singular.
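Before the proof, a quick numerical illustration of the criterion (a sketch, not part of the argument): for the monomial system $g_j(x)=x^{j-1}$, which appears again in the next problem, the matrix $A$ built from distinct points is non-singular.

```python
import numpy as np

# Sketch: build a_{i,j} = g_j(x_i) for g_j(x) = x^(j-1) at n distinct
# points and check that A is non-singular (illustrative choices only).
n = 4
x = np.array([0.0, 0.3, 0.7, 1.0])                      # n distinct points
A = np.array([[xi**j for j in range(n)] for xi in x])   # a_{i,j} = x_i^j
print(np.linalg.det(A))                                 # non-zero
```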
We first prove the forward direction: if $\{g_1,\ldots,g_n\}$ is a Chebyshev system, then for any $n$ distinct points $x_1,\ldots,x_n\in[a,b]$ the matrix $A$ with $a_{i,j}=g_j(x_i)$, $1\leq i,j\leq n$, is non-singular.
Writing out the matrix $A$ yields

$$A=\begin{pmatrix}g_1(x_1)&g_2(x_1)&\cdots&g_n(x_1)\\ g_1(x_2)&g_2(x_2)&\cdots&g_n(x_2)\\ \vdots&\vdots&\ddots&\vdots\\ g_1(x_n)&g_2(x_n)&\cdots&g_n(x_n)\end{pmatrix}.$$
Suppose, for contradiction, that $A$ is singular. Then there exist constants $\alpha_1,\ldots,\alpha_n$, not all zero, with $A\alpha=0$, i.e.

$$\sum_{j=1}^{n}\alpha_j g_j(x_i)=0,\qquad i=1,\ldots,n.$$

Then $g=\sum_{j=1}^{n}\alpha_j g_j$ is a linear combination which, by the linear independence in (i), is not identically zero, yet it has the $n$ distinct zeros $x_1,\ldots,x_n$, contradicting (ii). Hence $A$ is non-singular.
Conversely, assume $A$ is non-singular for every choice of $n$ distinct points $x_1,\ldots,x_n\in[a,b]$. If $\{g_1,\ldots,g_n\}$ were linearly dependent, there would exist constants $\alpha_j$, not all zero, with $\sum_{j=1}^{n}\alpha_j g_j(x)=0$ for all $x\in[a,b]$; evaluating at any $n$ distinct points would give $A\alpha=0$ with $\alpha\neq 0$, making $A$ singular. Hence $\{g_1,g_2,\ldots,g_n\}$ is a linearly independent set, and we have shown (i).
For (ii), let $g$ be a linear combination of $\{g_1,g_2,\ldots,g_n\}$ that is not identically zero, i.e.

$$g(x)=\alpha_1 g_1(x)+\alpha_2 g_2(x)+\cdots+\alpha_n g_n(x)$$

with the $\alpha_i$ not all zero.
Assume for the sake of contradiction that $g$ has $n$ distinct zeros $\hat{x}_1,\hat{x}_2,\ldots,\hat{x}_n\in[a,b]$.
This implies the following $n$ equations:

$$\begin{aligned}g(\hat{x}_1)&=0=\alpha_1 g_1(\hat{x}_1)+\alpha_2 g_2(\hat{x}_1)+\cdots+\alpha_n g_n(\hat{x}_1)\\ g(\hat{x}_2)&=0=\alpha_1 g_1(\hat{x}_2)+\alpha_2 g_2(\hat{x}_2)+\cdots+\alpha_n g_n(\hat{x}_2)\\ &\;\;\vdots\\ g(\hat{x}_n)&=0=\alpha_1 g_1(\hat{x}_n)+\alpha_2 g_2(\hat{x}_n)+\cdots+\alpha_n g_n(\hat{x}_n)\end{aligned}$$
Rewriting the equations in matrix form yields

$$\begin{pmatrix}g_1(\hat{x}_1)&g_2(\hat{x}_1)&\cdots&g_n(\hat{x}_1)\\ g_1(\hat{x}_2)&g_2(\hat{x}_2)&\cdots&g_n(\hat{x}_2)\\ \vdots&\vdots&\ddots&\vdots\\ g_1(\hat{x}_n)&g_2(\hat{x}_n)&\cdots&g_n(\hat{x}_n)\end{pmatrix}\begin{pmatrix}\alpha_1\\ \alpha_2\\ \vdots\\ \alpha_n\end{pmatrix}=\begin{pmatrix}0\\ 0\\ \vdots\\ 0\end{pmatrix}$$
Since the $\alpha_i$ are not all zero, the coefficient matrix above, which is exactly the matrix $A$ formed from the $n$ distinct points $\hat{x}_1,\ldots,\hat{x}_n$ and hence non-singular by hypothesis, would have linearly dependent columns, a contradiction. Hence $g$ has at most $n-1$ distinct zeros, and we have shown (ii).
Let $f\in C^{m+1}[a,b]$ be such that $f^{(m+1)}(x)\neq 0$ for all $x\in[a,b]$. Let $g_j(x)=x^{j-1}$, $j=1,\ldots,m+1$. Show that $\{g_1,\ldots,g_{m+1},f\}$ is a Chebyshev system. For this, you may use results from polynomial interpolation without proof.
We have to prove:

(i) $\{1,x,x^2,\ldots,x^m,f\}$ is a linearly independent set.

(ii) Any linear combination of this set that is not identically zero has at most $m+1$ distinct zeros. (Here the system has $n=m+2$ functions, so the bound from the definition is $n-1=m+1$.)
If we evaluate $g_1,g_2,\ldots,g_{m+1}$ at $m+1$ distinct points $\{x_1,x_2,\ldots,x_{m+1}\}$, we obtain the following Vandermonde matrix, whose determinant is non-zero:
$$\begin{pmatrix}1&x_1&x_1^2&\cdots&x_1^m\\ 1&x_2&x_2^2&\cdots&x_2^m\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ 1&x_{m+1}&x_{m+1}^2&\cdots&x_{m+1}^m\end{pmatrix}$$
Hence $g_1,g_2,\ldots,g_{m+1}$ are linearly independent.
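As a numerical aside (a sketch with illustrative nodes, not part of the proof), the Vandermonde determinant equals $\prod_{i<j}(x_j-x_i)$, which is non-zero exactly when the nodes are distinct:

```python
import numpy as np

# Sketch: the Vandermonde determinant at distinct nodes equals
# prod_{i<j} (x_j - x_i), hence is non-zero; nodes are illustrative.
x = np.array([0.0, 0.25, 0.5, 1.0])       # m+1 = 4 distinct points
V = np.vander(x, increasing=True)         # row i is (1, x_i, x_i^2, x_i^3)
det_direct = np.linalg.det(V)
det_formula = np.prod([x[j] - x[i]
                       for i in range(len(x))
                       for j in range(i + 1, len(x))])
print(det_direct, det_formula)            # agree up to rounding, non-zero
```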
Assume for the sake of contradiction that $f$ is a linear combination of $g_1,g_2,\ldots,g_{m+1}$, that is,

$$\begin{aligned}f(x)&=\beta_0 g_1(x)+\beta_1 g_2(x)+\cdots+\beta_m g_{m+1}(x)\\&=\beta_0\cdot 1+\beta_1 x+\beta_2 x^2+\cdots+\beta_m x^m.\end{aligned}$$
Then $f$ is a polynomial of degree at most $m$, so taking $m+1$ derivatives yields $f^{(m+1)}(x)=0$, which contradicts the hypothesis. Therefore $\{1,x,x^2,\ldots,x^m,f\}$ is a linearly independent set.
Let

$$h(x)=\beta_0 g_1(x)+\beta_1 g_2(x)+\cdots+\beta_m g_{m+1}(x)+\beta_{m+1}f(x)$$

be a linear combination that is not identically zero.
Suppose $h$ has $m+2$ (or more) distinct zeros in $[a,b]$. Applying Rolle's Theorem $m+1$ times, $h^{(m+1)}$ vanishes at some point of $(a,b)$. But $h^{(m+1)}(x)=\beta_{m+1}f^{(m+1)}(x)$, since the polynomial part of $h$ has degree at most $m$. If $\beta_{m+1}\neq 0$, this contradicts $f^{(m+1)}(x)\neq 0$ on $[a,b]$; if $\beta_{m+1}=0$, then $h$ is a polynomial of degree at most $m$ with more than $m$ zeros, hence identically zero, again a contradiction.
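A small symbolic sketch of the differentiation step (assuming the sympy package is available; $m=3$ and $f(x)=e^x$ are illustrative choices, not part of the problem): differentiating $h$ exactly $m+1$ times kills the polynomial part and leaves $\beta_{m+1}f^{(m+1)}$.

```python
import sympy as sp

# Sketch: for h = p_m(x) + beta*f(x) with deg(p_m) <= m, the (m+1)st
# derivative of h is beta * f^(m+1)(x); m = 3 and f = exp are illustrative.
x, beta = sp.symbols('x beta')
m = 3
p = sum(sp.symbols(f'b{i}') * x**i for i in range(m + 1))  # degree <= m
f = sp.exp(x)                                              # f^(m+1) != 0
h = p + beta * f
print(sp.simplify(sp.diff(h, x, m + 1) - beta * sp.diff(f, x, m + 1)))  # 0
```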
(i) and (ii) show that $\{1,x,x^2,\ldots,x^m,f\}$ is a Chebyshev system.
Let

$$I_n(f)=\sum_{k=1}^{n}w_{n,k}\,f(x_{n,k}),\qquad a\leq x_{n,k}\leq b\qquad (0)$$

be a sequence of integration rules.
Suppose

$$\lim_{n\to\infty}I_n(x^k)=\int_a^b x^k\,dx,\qquad k=0,1,\ldots\qquad (1)$$
and

$$\sum_{k=1}^{n}|w_{n,k}|\leq M,\qquad n=1,2,\ldots\qquad (2)$$
for some constant $M$. Show that

$$\lim_{n\to\infty}I_n(f)=\int_a^b f(x)\,dx$$

for all $f\in C[a,b]$.
By the Weierstrass Approximation Theorem, given $\epsilon>0$ there exists a polynomial $p(x)$ such that

$$\max_{a\leq x\leq b}|f(x)-p(x)|\leq\min\left\{\frac{\epsilon}{2M},\,\frac{\epsilon}{2(b-a)}\right\}\qquad (3)$$
Let

$$I(f)=\int_a^b f(x)\,dx.$$
Adding and subtracting $I(p)$ and $I_n(p)$ yields

$$I(f)-I_n(f)=[I(f)-I(p)]+[I(p)-I_n(p)]+[I_n(p)-I_n(f)].$$
By the triangle inequality and equations (2) and (3),

$$\begin{aligned}|I(f)-I_n(f)|&\leq|I(f-p)|+|I(p)-I_n(p)|+|I_n(p-f)|\\&\leq\int_a^b|f(x)-p(x)|\,dx+|I(p)-I_n(p)|+\sum_{k=1}^{n}|w_{n,k}|\,|f(x_{n,k})-p(x_{n,k})|\\&\leq\frac{\epsilon}{2(b-a)}\int_a^b dx+|I(p)-I_n(p)|+\frac{\epsilon}{2M}\sum_{k=1}^{n}|w_{n,k}|\\&\leq\frac{\epsilon}{2(b-a)}(b-a)+|I(p)-I_n(p)|+\frac{\epsilon}{2M}M\\&=\epsilon+|I(p)-I_n(p)|.\end{aligned}$$
Since $p$ is a polynomial, i.e. a finite linear combination of the monomials $x^k$, and $I_n$ is linear, equation (1) gives

$$\lim_{n\to\infty}|I(p)-I_n(p)|=0.$$
Hence $\limsup_{n\to\infty}|I(f)-I_n(f)|\leq\epsilon$ for every $\epsilon>0$, i.e.

$$I(f)=\lim_{n\to\infty}I_n(f).$$
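As an illustration (a sketch, not part of the proof): Gauss–Legendre rules on $[-1,1]$ have positive weights summing to $b-a=2$, so they satisfy (1) and (2), and the theorem predicts convergence even for merely continuous integrands such as $f(x)=|x|$.

```python
import numpy as np

# Sketch: Gauss-Legendre rules satisfy (1) and (2), so I_n(f) -> I(f)
# for continuous f; f(x) = |x| (exact integral 1 on [-1, 1]) is a
# deliberately non-smooth test integrand.
f = np.abs
I_exact = 1.0
for n in (2, 4, 8, 16, 32):
    nodes, weights = np.polynomial.legendre.leggauss(n)
    print(n, abs(weights @ f(nodes) - I_exact))   # error shrinks with n
```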
Show that if all $w_{n,k}>0$, then (1) implies (2).
For $k=0$, equation (1) yields

$$\lim_{n\to\infty}I_n(x^0)=\int_a^b x^0\,dx,\qquad\text{i.e.}\qquad\lim_{n\to\infty}I_n(1)=\int_a^b 1\,dx=b-a.$$
Letting $f(x)=1$ in equation (0) yields

$$I_n(1)=\sum_{k=1}^{n}w_{n,k}\cdot 1=\sum_{k=1}^{n}w_{n,k}.$$
Combining the two results above,

$$\lim_{n\to\infty}\sum_{k=1}^{n}w_{n,k}=b-a.$$

A convergent sequence is bounded, so there exists a constant $M$ such that $\sum_{k=1}^{n}w_{n,k}\leq M$ for all $n$.
Since $w_{n,k}>0$, we have $|w_{n,k}|=w_{n,k}$, and therefore

$$\sum_{k=1}^{n}|w_{n,k}|\leq M,\qquad n=1,2,\ldots,$$

which is equation (2).
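Continuing the Gauss–Legendre illustration (a sketch only): the weights are positive and their sums are uniformly bounded, in line with part (b).

```python
import numpy as np

# Sketch: positive Gauss-Legendre weights sum to b - a = 2 for every n
# (the rule is exact for constants), so (2) holds with M = 2.
for n in (1, 2, 5, 10, 50):
    _, weights = np.polynomial.legendre.leggauss(n)
    assert (weights > 0).all()
    print(n, weights.sum())                       # equals 2 up to rounding
```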
Consider the real system of linear equations

$$Ax=b\qquad (1)$$

where $A$ is non-singular and satisfies $(v,Av)>0$ for all real $v\neq 0$, where the Euclidean inner product is used here.
Let $M=\tfrac{1}{2}(A+A^T)$; we first show that $(v,Mv)=(v,Av)$ for all real $v$. Substituting for $M$ in $(v,Mv)$ and expanding the inner product, we have
$$\begin{aligned}(v,Mv)&=\left(v,\tfrac{1}{2}(A+A^T)v\right)\\&=\left(v,\tfrac{1}{2}Av+\tfrac{1}{2}A^Tv\right)\\&=\left(v,\tfrac{1}{2}Av\right)+\left(v,\tfrac{1}{2}A^Tv\right)\\&=\tfrac{1}{2}(v,Av)+\tfrac{1}{2}(v,A^Tv)\end{aligned}$$
From properties of the inner product we have

$$(v,A^Tv)=(Av,v)=(v,Av).$$
Hence,

$$(v,Mv)=\tfrac{1}{2}(v,Av)+\tfrac{1}{2}(v,A^Tv)=\tfrac{1}{2}(v,Av)+\tfrac{1}{2}(v,Av)=(v,Av).$$
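A quick numerical check of this identity (a sketch with random data, not part of the proof):

```python
import numpy as np

# Sketch: (v, Mv) = (v, Av) with M = (A + A^T)/2, because the skew part
# (A - A^T)/2 contributes nothing to the quadratic form.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
v = rng.standard_normal(5)
M = 0.5 * (A + A.T)
print(v @ M @ v, v @ A @ v)                       # equal up to rounding
```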
Prove that

$$\frac{(v,Av)}{(v,v)}\geq\lambda_{\min}(M)>0$$

where $\lambda_{\min}(M)$ is the minimum eigenvalue of $M$.
By part (a),

$$\frac{(v,Av)}{(v,v)}=\frac{(v,Mv)}{(v,v)}=\frac{v^TMv}{v^Tv}.$$
Since $M$ is symmetric, it has an eigenvalue decomposition $M=Q^T\Lambda Q$, where $Q$ is orthogonal and $\Lambda$ is a diagonal matrix containing the eigenvalues of $M$.
Substitution yields

$$\frac{(v,Av)}{(v,v)}=\frac{v^TQ^T\Lambda Qv}{v^Tv}.$$
Let $Qv=x$. This implies the following three relationships:

$$\begin{aligned}v&=Q^Tx\\ v^TQ^T&=x^T\\ v^T&=x^TQ\end{aligned}$$
Substituting,

$$\frac{v^TQ^T\Lambda Qv}{v^Tv}=\frac{x^T\Lambda x}{x^TQQ^Tx}=\frac{x^T\Lambda x}{x^Tx}.$$
Expanding the numerator we have

$$\begin{aligned}x^T\Lambda x&=\begin{bmatrix}x_1&x_2&\ldots&x_n\end{bmatrix}\begin{bmatrix}\lambda_1&&&\\&\lambda_2&&\\&&\ddots&\\&&&\lambda_n\end{bmatrix}\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}\\&=\begin{bmatrix}x_1&x_2&\ldots&x_n\end{bmatrix}\begin{bmatrix}\lambda_1x_1\\\lambda_2x_2\\\vdots\\\lambda_nx_n\end{bmatrix}\\&=\lambda_1x_1^2+\lambda_2x_2^2+\ldots+\lambda_nx_n^2\\&=\sum_{i=1}^{n}\lambda_ix_i^2.\end{aligned}$$
Expanding the denominator yields

$$x^Tx=\sum_{i=1}^{n}x_i^2.$$
Substituting,

$$\frac{x^T\Lambda x}{x^Tx}=\frac{\sum_{i=1}^{n}\lambda_ix_i^2}{\sum_{i=1}^{n}x_i^2}\geq\lambda_{\min}(M)\,\frac{\sum_{i=1}^{n}x_i^2}{\sum_{i=1}^{n}x_i^2}=\lambda_{\min}(M).$$
From part (a), $(v,Av)=(v,Mv)>0$ for all real $v\neq 0$. Therefore $M$ is positive definite, which implies all its eigenvalues are positive. In particular,

$$\lambda_{\min}(M)>0.$$
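A numerical check of the bound (a sketch; the diagonal shift is an illustrative way to make $(v,Av)>0$ plausible, not part of the problem):

```python
import numpy as np

# Sketch: the Rayleigh quotient (v, Av)/(v, v) is bounded below by
# lambda_min(M); the shift 5*I only serves to make M positive definite.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5)) + 5 * np.eye(5)
M = 0.5 * (A + A.T)
lam_min = np.linalg.eigvalsh(M).min()             # positive with this shift
v = rng.standard_normal(5)
print((v @ A @ v) / (v @ v), ">=", lam_min)
```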
Now consider the iteration for computing a sequence of approximate solutions to (1),

$$x_{k+1}=x_k+\alpha r_k$$

where $r_k=b-Ax_k$ and $\alpha$ is chosen to minimize $\|r_{k+1}\|_2$ as a function of $\alpha$. Prove that

$$\frac{\|r_{k+1}\|_2}{\|r_k\|_2}\leq\left(1-\frac{\lambda_{\min}(M)^2}{\lambda_{\max}(A^TA)}\right)^{1/2}.$$
First, we want to write $\|r_{k+1}\|^2$ as a function of $\alpha$, i.e.

$$f(\alpha)=\|r_{k+1}\|^2.$$
Shifting the index in $r_k=b-Ax_k$ from $k$ to $k+1$, substituting the iteration for $x_{k+1}$, and applying the definition of the norm yields

$$\begin{aligned}\|r_{k+1}\|^2&=\|b-Ax_{k+1}\|^2\\&=\|b-Ax_k-\alpha Ar_k\|^2\\&=\|r_k-\alpha Ar_k\|^2\\&=(r_k-\alpha Ar_k,\,r_k-\alpha Ar_k)\\&=(r_k,r_k)-\alpha(Ar_k,r_k)-\alpha(r_k,Ar_k)+\alpha^2(Ar_k,Ar_k).\end{aligned}$$
Since the Euclidean inner product is symmetric,

$$(r_k,Ar_k)=(Ar_k,r_k).$$
Hence,

$$f(\alpha)=\|r_{k+1}\|^2=(r_k,r_k)-2\alpha(Ar_k,r_k)+\alpha^2(Ar_k,Ar_k).$$
Next we minimize $f(\alpha)$ as a function of $\alpha$. Taking its derivative with respect to $\alpha$ yields

$$f'(\alpha)=-2(Ar_k,r_k)+2\alpha(Ar_k,Ar_k).$$
Setting $f'(\alpha)=0$ and solving for $\alpha$ gives

$$\begin{aligned}0&=-2(Ar_k,r_k)+2\alpha(Ar_k,Ar_k)\\0&=-(Ar_k,r_k)+\alpha(Ar_k,Ar_k)\\(Ar_k,r_k)&=\alpha(Ar_k,Ar_k)\\\alpha&=\frac{(Ar_k,r_k)}{(Ar_k,Ar_k)}.\end{aligned}$$

Since $f''(\alpha)=2(Ar_k,Ar_k)>0$ whenever $r_k\neq 0$, this $\alpha$ indeed minimizes $f$.
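In code, one step of the iteration with this optimal $\alpha$ looks as follows (a sketch with random data; the shifted matrix is an illustrative way to satisfy the positivity hypothesis):

```python
import numpy as np

# Sketch: one step of x_{k+1} = x_k + alpha*r_k with the minimizing
# alpha = (Ar, r)/(Ar, Ar); the residual norm cannot increase.
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)) + 5 * np.eye(5)   # makes (v, Av) > 0 here
x = rng.standard_normal(5)
b = rng.standard_normal(5)
r = b - A @ x
Ar = A @ r
alpha = (Ar @ r) / (Ar @ Ar)
print(np.linalg.norm(b - A @ (x + alpha * r)), "<=", np.linalg.norm(r))
```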
Substituting this $\alpha$ into $\|r_{k+1}\|^2$ gives

$$\begin{aligned}\|r_{k+1}\|^2&=(r_k,r_k)-2\alpha(Ar_k,r_k)+\alpha^2(Ar_k,Ar_k)\\&=(r_k,r_k)-2\frac{(Ar_k,r_k)}{(Ar_k,Ar_k)}(Ar_k,r_k)+\frac{(Ar_k,r_k)^2}{(Ar_k,Ar_k)^2}(Ar_k,Ar_k)\\&=(r_k,r_k)-2\frac{(Ar_k,r_k)^2}{(Ar_k,Ar_k)}+\frac{(Ar_k,r_k)^2}{(Ar_k,Ar_k)}\\&=(r_k,r_k)-\frac{(Ar_k,r_k)^2}{(Ar_k,Ar_k)}.\end{aligned}$$
By definition of the norm, $\|r_k\|^2=(r_k,r_k)$.
Hence

$$\frac{\|r_{k+1}\|^2}{\|r_k\|^2}=1-\frac{(Ar_k,r_k)^2}{(r_k,r_k)(Ar_k,Ar_k)}.$$
Multiplying and dividing the second term on the right-hand side by $(r_k,r_k)$ and using $(Ar_k,Ar_k)=(r_k,A^TAr_k)$ yields

$$\frac{\|r_{k+1}\|^2}{\|r_k\|^2}=1-\frac{(Ar_k,r_k)^2\,(r_k,r_k)}{(r_k,r_k)^2\,(r_k,A^TAr_k)}=1-\left(\frac{(Ar_k,r_k)}{(r_k,r_k)}\right)^2\frac{(r_k,r_k)}{(r_k,A^TAr_k)}.$$
From part (b), $\frac{(Ar_k,r_k)}{(r_k,r_k)}\geq\lambda_{\min}(M)>0$, and since $A^TA$ is symmetric positive definite, the Rayleigh quotient bound gives $(r_k,A^TAr_k)\leq\lambda_{\max}(A^TA)\,(r_k,r_k)$. Therefore

$$\frac{\|r_{k+1}\|^2}{\|r_k\|^2}\leq 1-\frac{\lambda_{\min}(M)^2}{\lambda_{\max}(A^TA)},$$

and taking square roots yields the desired result:

$$\frac{\|r_{k+1}\|_2}{\|r_k\|_2}\leq\left(1-\frac{\lambda_{\min}(M)^2}{\lambda_{\max}(A^TA)}\right)^{1/2}.$$
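To close, a numerical sketch of the whole iteration against the proven contraction factor (random data; illustrative only):

```python
import numpy as np

# Sketch: run the minimal-residual iteration and compare each ratio
# ||r_{k+1}||/||r_k|| with (1 - lam_min(M)^2 / lam_max(A^T A))^(1/2).
rng = np.random.default_rng(3)
n = 5
A = rng.standard_normal((n, n)) + 5 * np.eye(n)   # makes (v, Av) > 0 here
M = 0.5 * (A + A.T)
lam_min = np.linalg.eigvalsh(M).min()
lam_max = np.linalg.eigvalsh(A.T @ A).max()
bound = np.sqrt(1 - lam_min**2 / lam_max)
b = rng.standard_normal(n)
x = np.zeros(n)
for k in range(10):
    r = b - A @ x
    Ar = A @ r
    alpha = (Ar @ r) / (Ar @ Ar)
    x = x + alpha * r
    print(np.linalg.norm(b - A @ x) / np.linalg.norm(r), "<=", bound)
```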