Given $f\in C[a,b]$, let $p_{n}^{*}$ denote the best uniform approximation to $f$ among polynomials of degree $\leq n$; i.e.,

$$\max _{x\in [a,b]}|f(x)-p_{n}(x)|$$

is minimized by the choice $p_{n}=p_{n}^{*}$. Let $e(x)=f(x)-p_{n}^{*}(x)$. Prove that there are at least two points $x_{1},x_{2}\in [a,b]$ such that

$$|e(x_{1})|=|e(x_{2})|=\max _{x\in [a,b]}|f(x)-p_{n}^{*}(x)|$$

and $e(x_{1})=-e(x_{2})$.
Since both $f(x)$ and $p_{n}^{*}(x)$ are continuous, so is the function $e(x)=f(x)-p_{n}^{*}(x)$, and therefore $e(x)$ attains both a maximum and a minimum on $[a,b]$.
Let $M=\max _{x\in [a,b]}e(x)$ and $m=\min _{x\in [a,b]}e(x)$.
If $M\neq -m$, then the polynomial $q(x)=p_{n}^{*}(x)+{\frac {M+m}{2}}$ (a vertical shift of the supposed best approximation $p_{n}^{*}(x)$) is a better approximation to $f(x)$, as the following computation shows.
$$\begin{aligned}f(x)-q(x)&=f(x)-p_{n}^{*}(x)-{\frac {M+m}{2}}\leq M-{\frac {M+m}{2}}={\frac {M-m}{2}},\\f(x)-q(x)&\geq m-{\frac {M+m}{2}}={\frac {m-M}{2}}.\end{aligned}$$
Therefore, $\|f-q\|_{\infty }\leq {\frac {M-m}{2}}$. On the other hand, $\|f-p_{n}^{*}\|_{\infty }=\max(M,-m)$, and ${\frac {M-m}{2}}<\max(M,-m)$ whenever $M\neq -m$, so $q$ would be a strictly better approximation than $p_{n}^{*}$, contradicting the optimality of $p_{n}^{*}$. Hence $M=-m$. Choosing $x_{1}$ with $e(x_{1})=M$ and $x_{2}$ with $e(x_{2})=m$ (both exist since $e$ attains its extrema) gives $|e(x_{1})|=|e(x_{2})|=\|e\|_{\infty }$ and $e(x_{1})=-e(x_{2})$.
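As a sanity check, the $n=0$ case can be verified numerically: the best constant approximation to $f$ on $[a,b]$ is $(\max f+\min f)/2$, and the error then attains its maximum magnitude with both signs. A minimal sketch (the choice of $f$ and of the interval is illustrative, not from the problem):

```python
import numpy as np

# n = 0: the best constant approximation to f on [a, b] is (max f + min f)/2,
# and the error e = f - p* attains both +M and -M, matching the result above
f = np.cos                          # illustrative continuous f
xs = np.linspace(0.0, 2.0, 10001)   # [a, b] = [0, 2], sampled on a fine grid
p_star = (f(xs).max() + f(xs).min()) / 2
e = f(xs) - p_star
print(e.max(), e.min())             # equal magnitudes, opposite signs
```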
Find the node $x_{1}$ and the weight $w_{1}$ so that the integration rule

$$I=\int _{0}^{1}x^{-{\frac {1}{2}}}f(x)\,dx\approx w_{1}f(x_{1})$$

is exact for linear functions. (No knowledge of orthogonal polynomials is required.)
Let $f(x)=1$. Then $\int _{0}^{1}x^{-1/2}\,dx=\left[2x^{1/2}\right]_{0}^{1}=2=w_{1}\cdot 1$, which gives $w_{1}=2$.
Let $f(x)=x$. Then $\int _{0}^{1}x\cdot x^{-1/2}\,dx=\left[{\tfrac {2}{3}}x^{3/2}\right]_{0}^{1}={\tfrac {2}{3}}=w_{1}x_{1}=2x_{1}$, which gives $x_{1}={\tfrac {1}{3}}$.
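A quick numerical check that the rule reproduces the weighted integrals of $1$ and $x$ (a sketch; scipy's adaptive quadrature handles the integrable singularity at $0$):

```python
from scipy.integrate import quad

w1, x1 = 2.0, 1.0 / 3.0  # the one-point rule derived above

for f in (lambda x: 1.0, lambda x: x):
    exact, _ = quad(lambda x: f(x) / x**0.5, 0, 1)
    print(exact, w1 * f(x1))  # the two values agree for constant and linear f
```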
Show that no one-point rule for approximating $I$ can be exact for quadratics.
Suppose there is a one-point rule for approximating $I$ that is exact for quadratics. In particular it is exact for constant and linear functions, so by the previous part $w_{1}=2$ and $x_{1}={\tfrac {1}{3}}$. Let $f(x)=x^{2}$. Then

$$\int _{0}^{1}x^{2}\cdot x^{-1/2}\,dx={\frac {2}{5}},$$

but

$$w_{1}x_{1}^{2}=2\left({\frac {1}{3}}\right)^{2}={\frac {2}{9}},$$

a contradiction.
In fact,

$$I-w_{1}f(x_{1})=cf''(\xi ){\text{ for some }}\xi \in [0,1].$$

Find $c$.
Let $f(x)=x^{2}$. Then $f''(x)=2$. Using the values computed in part (b), we have

$${\frac {2}{5}}-{\frac {2}{9}}={\frac {8}{45}}=c\cdot 2,$$

which implies $c={\frac {4}{45}}$.
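The arithmetic can be confirmed with exact rational numbers (a minimal sketch):

```python
from fractions import Fraction

I_x2 = Fraction(2, 5)             # integral of x^2 * x^(-1/2) over [0, 1]
rule_x2 = 2 * Fraction(1, 3)**2   # w1 * x1^2 = 2 * (1/3)^2 = 2/9
c = (I_x2 - rule_x2) / 2          # since f''(x) = 2 when f(x) = x^2
print(c)                          # 4/45
```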
Let $f(x)$ and $g(x)$ be two polynomials of degree $\leq 3$. Suppose $f$ and $g$ agree at $x=a$, $x=b$, and $x=(a+b)/2$. Show

$$\int _{a}^{b}f(x)\,dx=\int _{a}^{b}g(x)\,dx.$$
There exist weights $w_{1},w_{2},w_{3}$ such that if $f(x)$ is a polynomial of degree $\leq 3$, then

$$\int _{a}^{b}f(x)\,dx=w_{1}f(a)+w_{2}f(b)+w_{3}f\!\left({\frac {a+b}{2}}\right).$$

(Simpson's rule, with $w_{1}=w_{2}={\frac {b-a}{6}}$ and $w_{3}={\frac {2(b-a)}{3}}$, has exactly this property.) Hence, since $f$ and $g$ agree at the three nodes,

$$\begin{aligned}\int _{a}^{b}f(x)\,dx&=w_{1}f(a)+w_{2}f(b)+w_{3}f\!\left({\frac {a+b}{2}}\right)\\&=w_{1}g(a)+w_{2}g(b)+w_{3}g\!\left({\frac {a+b}{2}}\right)\\&=\int _{a}^{b}g(x)\,dx.\end{aligned}$$
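Simpson's exactness for degrees $0$ through $3$ can be verified symbolically (a sketch using sympy):

```python
import sympy as sp

x, a, b = sp.symbols('x a b')
w_end = (b - a) / 6        # weight at each endpoint
w_mid = 2 * (b - a) / 3    # weight at the midpoint

for k in range(4):         # monomials of degree 0 through 3
    f = x**k
    rule = (w_end * f.subs(x, a) + w_end * f.subs(x, b)
            + w_mid * f.subs(x, (a + b) / 2))
    assert sp.simplify(sp.integrate(f, (x, a, b)) - rule) == 0
print("Simpson's rule is exact for all polynomials of degree <= 3")
```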
Convergence of any starting vector
Using the iteration $x_{k+1}=x_{k}+\alpha (b-Ax_{k})$, we have that

$$\begin{aligned}e_{k+1}&=x_{k+1}-x^{*}\\&=x_{k}-x^{*}+\alpha (b-Ax_{k})\\&=x_{k}-x^{*}+\alpha (Ax^{*}-Ax_{k})\\&=e_{k}-\alpha Ae_{k}\\&=(I-\alpha A)e_{k}.\end{aligned}$$
Then, computing norms, we obtain $\|e_{k+1}\|\leq \|I-\alpha A\|\,\|e_{k}\|$.
In particular, considering $\|\cdot \|_{2}$, we have that $\|e_{k+1}\|_{2}\leq \|I-\alpha A\|_{2}\,\|e_{k}\|_{2}$.
Now, in order to study the convergence of the method, we need to analyze $\|I-\alpha A\|_{2}$. We use the following characterization:

$$\begin{aligned}\|I-\alpha A\|_{2}^{2}&=\rho \left((I-\alpha A)(I-\alpha A)^{*}\right)\\&=\rho \left(I-\alpha A^{*}-\alpha A+|\alpha |^{2}AA^{*}\right).\end{aligned}$$
Let $\lambda$ denote an eigenvalue of the matrix $A$; then $1-\alpha \lambda$ is an eigenvalue of $I-\alpha A$. Since $\rho (I-\alpha A)\leq \|I-\alpha A\|_{2}$, the condition $\rho \left((I-\alpha A)(I-\alpha A)^{*}\right)<1$ implies

$$|1-\alpha \lambda |^{2}=1-2\alpha \operatorname {Re} (\lambda )+\alpha ^{2}|\lambda |^{2}<1,$$

i.e.

$$-2\alpha \operatorname {Re} (\lambda )+\alpha ^{2}|\lambda |^{2}<0.$$
The left-hand side is quadratic in $\alpha$ and opens upward, so it is negative exactly between its two roots,

$$\alpha _{1}=0,\qquad \alpha _{2}={\frac {2\operatorname {Re} (\lambda )}{|\lambda |^{2}}}.$$
Then, if $\operatorname {Re} (\lambda )>0$ for all the eigenvalues of the matrix $A$, we can find

$$\alpha \in \left(0,\,\min _{\lambda }{\frac {2\operatorname {Re} (\lambda )}{|\lambda |^{2}}}\right)$$

such that $|1-\alpha \lambda |<1$ for every eigenvalue, i.e. $\rho (I-\alpha A)<1$; since $\rho (I-\alpha A)<1$ is exactly the condition for $(I-\alpha A)^{k}e_{0}\to 0$ for every $e_{0}$, the method converges from any starting vector.
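A small numerical illustration of this convergence (the matrix, right-hand side, and step size below are illustrative choices, not from the problem):

```python
import numpy as np

# Richardson iteration x_{k+1} = x_k + alpha*(b - A x_k) for a matrix whose
# eigenvalues (3 +/- i) have positive real part; here 2 Re(lambda)/|lambda|^2 = 0.6
A = np.array([[3.0, 1.0], [-1.0, 3.0]])
b = np.array([1.0, 2.0])
x_true = np.linalg.solve(A, b)

alpha = 0.2          # inside (0, 0.6)
x = np.zeros(2)
for _ in range(100):
    x = x + alpha * (b - A @ x)
print(np.linalg.norm(x - x_true))   # essentially zero
```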
Choosing alpha if A symmetric
On the other hand, if the matrix $A$ is symmetric, we have that $\|I-\alpha A\|_{2}=\rho \left(I-\alpha A\right)$. Then, using the Schur decomposition, we can find a unitary matrix $U$ and a diagonal matrix $\Lambda$ such that $A=U^{*}\Lambda U$, and in consequence,
, and in consequence,
‖
I
−
α
A
‖
2
=
‖
I
−
α
Λ
‖
2
=
ρ
(
I
−
α
Λ
)
=
max
i
=
1
,
…
,
n
(
1
−
α
λ
i
)
{\displaystyle {\begin{aligned}\|I-\alpha A\|_{2}&=\|I-\alpha \Lambda \|_{2}\\&=\rho (I-\alpha \Lambda )\\&=\max _{i=1,\dots ,n}(1-\alpha \lambda _{i})\\\end{aligned}}}
Assuming the eigenvalues are positive and ordered $0<\lambda _{1}\leq \cdots \leq \lambda _{n}$, this last expression is minimized if $(1-\alpha \lambda _{1})=-(1-\alpha \lambda _{n})$, i.e., if $\alpha _{opt}=2/(\lambda _{1}+\lambda _{n})$.
With this optimal value of $\alpha$, we obtain that

$$(I-\alpha _{opt}\Lambda )_{i,i}=1-{\frac {2}{\lambda _{1}+\lambda _{n}}}\lambda _{i},$$

which implies that

$$\|I-\alpha _{opt}\Lambda \|_{2}={\frac {\lambda _{n}-\lambda _{1}}{\lambda _{1}+\lambda _{n}}}.$$
Finally, we've obtained that

$$\|e_{k+1}\|_{2}\leq {\frac {\lambda _{n}-\lambda _{1}}{\lambda _{1}+\lambda _{n}}}\|e_{k}\|_{2}.$$
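This optimal contraction factor is easy to check numerically (a sketch with an illustrative diagonal matrix):

```python
import numpy as np

# for a symmetric A with eigenvalues 1 <= 2 <= 5, alpha_opt = 2/(1+5) = 1/3
A = np.diag([1.0, 2.0, 5.0])
lam = np.linalg.eigvalsh(A)
alpha_opt = 2 / (lam[0] + lam[-1])
print(np.linalg.norm(np.eye(3) - alpha_opt * A, 2))  # (5-1)/(1+5) = 2/3
```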
Show that if some eigenvalues of $A$ have negative real part and some have positive real part, then there is no real $\alpha$ for which the iterations converge.
If $\operatorname {Re} (\lambda _{i})>0$ for some eigenvalue $\lambda _{i}$, then convergence requires $|1-\alpha \lambda _{i}|<1$, i.e.

$$\alpha \in \left(0,\,{\frac {2\operatorname {Re} (\lambda _{i})}{|\lambda _{i}|^{2}}}\right).$$

Hence $\alpha >0$.
Similarly, if $\operatorname {Re} (\lambda _{i})<0$ for some eigenvalue $\lambda _{i}$, then convergence requires

$$\alpha \in \left({\frac {2\operatorname {Re} (\lambda _{i})}{|\lambda _{i}|^{2}}},\,0\right).$$

Hence $\alpha <0$.
If some eigenvalues have positive real part and some have negative real part, there does not exist a real $\alpha$ for which the iterations converge, since $\alpha$ cannot be both positive and negative simultaneously.
Let $p=\|I-\alpha A\|<1$ for a matrix norm associated to a vector norm. Show that the error can be expressed in terms of the difference between consecutive iterates, namely

$$\|x-x_{k+1}\|\leq {\frac {p}{1-p}}\|x_{k}-x_{k+1}\|.$$

(The proof of this is short but a little tricky.)
Since $x-x_{k+1}=(I-\alpha A)(x-x_{k})$ (subtract the iteration from the fixed-point identity $x=x+\alpha (b-Ax)$), we have

$$\begin{aligned}\|x-x_{k+1}\|&\leq p\,\|x-x_{k}\|=p\,\|x-x_{k+1}+x_{k+1}-x_{k}\|\\&\leq p\,\|x-x_{k+1}\|+p\,\|x_{k+1}-x_{k}\|\\(1-p)\|x-x_{k+1}\|&\leq p\,\|x_{k+1}-x_{k}\|\\\|x-x_{k+1}\|&\leq {\frac {p}{1-p}}\|x_{k+1}-x_{k}\|.\end{aligned}$$
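A sanity check of the bound on a small example (the matrix, right-hand side, and step size are illustrative choices):

```python
import numpy as np

# verify ||x - x_{k+1}|| <= p/(1-p) * ||x_k - x_{k+1}|| along the iteration
A = np.array([[3.0, 1.0], [1.0, 3.0]])   # symmetric, eigenvalues 2 and 4
b = np.array([1.0, -1.0])
x_true = np.linalg.solve(A, b)

alpha = 0.25
p = np.linalg.norm(np.eye(2) - alpha * A, 2)   # p = 0.5 < 1 here
x_k = np.zeros(2)
for _ in range(10):
    x_next = x_k + alpha * (b - A @ x_k)
    lhs = np.linalg.norm(x_true - x_next)
    rhs = p / (1 - p) * np.linalg.norm(x_k - x_next)
    assert lhs <= rhs + 1e-12
    x_k = x_next
print("bound holds at every step; p =", p)
```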
Let $A$ be the tridiagonal matrix

$$\begin{pmatrix}3&1&0&0&\cdots &0\\-1&3&1&0&\cdots &0\\\vdots &\ddots &\ddots &\ddots &&\vdots \\\vdots &&\ddots &\ddots &\ddots &\vdots \\\vdots &&&\ddots &3&1\\0&\cdots &\cdots &\cdots &-1&3\end{pmatrix}$$
Find a value of $\alpha$ that guarantees convergence.
By Gerschgorin's theorem, every eigenvalue of $A$ lies in the disk $|\lambda -3|\leq 2$; in particular, the magnitudes of all the eigenvalues are between $1$ and $5$. The eigenvalues of $I-\alpha A$ are $1-\alpha \lambda _{i}$, and for $\alpha >0$,

$$\rho (I-\alpha A)=\max _{i}|1-\alpha \lambda _{i}|\leq |1-3\alpha |+2\alpha ,$$

since $|1-\alpha \lambda |=|(1-3\alpha )-\alpha (\lambda -3)|\leq |1-3\alpha |+\alpha |\lambda -3|$. We want $|1-3\alpha |+2\alpha <1$. For $0<\alpha \leq {\tfrac {1}{3}}$ this reads $1-\alpha <1$, which always holds; for ${\tfrac {1}{3}}<\alpha$ it reads $5\alpha -1<1$, i.e. $\alpha <{\tfrac {2}{5}}$. Therefore any

$$0<\alpha <{\frac {2}{5}}$$

guarantees $\rho (I-\alpha A)<1$, and hence convergence.
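A numerical spot check for a modest size ($n=6$ is an arbitrary illustrative choice):

```python
import numpy as np

# build the tridiagonal matrix above and check rho(I - alpha*A) < 1 on (0, 2/5)
n = 6
A = 3 * np.eye(n) + np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)

for alpha in (0.1, 0.2, 0.3, 0.39):
    rho = max(abs(np.linalg.eigvals(np.eye(n) - alpha * A)))
    print(alpha, rho)   # each spectral radius is below 1
```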