Let $f$ be continuous on $[0,1]$. A polynomial $p$ of degree not greater than $n$ is said to be a best (or Chebyshev) approximation to $f$ if $p$ minimizes the expression

$$E(p)=\max_{x\in[0,1]}|f(x)-p(x)|$$
Show that a sufficient condition for $p$ to be a best approximation is that there exist points $x_0<x_1<\cdots<x_{n+1}$ in $[0,1]$ such that

$$f(x_i)-p(x_i)=-[f(x_{i+1})-p(x_{i+1})]=\pm E(p),\qquad i=0,1,2,\dots,n.$$
Assume, for contradiction, that there exists $q\in\Pi^n$ such that $E(q)<E(p)$. Then for $i=0,1,\ldots,n+1$,

$$|f(x_i)-q(x_i)|\leq E(q)<E(p)=|f(x_i)-p(x_i)|$$

Let $e_q(x)=f(x)-q(x)$ and $e_p(x)=f(x)-p(x)$. Then $e\equiv e_q-e_p=p-q$. At each point $x_i$ the term $e_p(x_i)=\pm E(p)$ dominates $e_q(x_i)$ in magnitude, so $e(x_i)$ takes the sign of $-e_p(x_i)$. Since $e_p$ changes sign $n+1$ times across the points $x_0,\ldots,x_{n+1}$ (by hypothesis), $e$ changes sign $n+1$ times as well, so by the intermediate value theorem $e$ has at least $n+1$ zeros. However $e=p-q$ is a polynomial of degree at most $n$ and thus can have at most $n$ zeros unless it vanishes identically. Therefore $e\equiv 0$ and $p=q$, which contradicts $E(q)<E(p)$. Hence no such $q$ exists, and $p$ is a best approximation.
Compute the best linear approximation to $x^2$ in $[0,1]$. [Hint: Drawing a line through the parabola will allow you to conjecture where two points of oscillation must lie. Use conditions from (a) to determine the third point and coefficients of $p$.]
First we need to find the roots of $T_2(x)$ mapped to $[0,1]$, which are given by

$$x_k=\frac{1}{2}+\frac{1}{2}\cos\left(\frac{2k-1}{4}\pi\right)$$

So our points at which to interpolate are

$$x_1=\frac{1}{2}+\frac{1}{2}\cos\left(\frac{\pi}{4}\right),\qquad x_2=\frac{1}{2}+\frac{1}{2}\cos\left(\frac{3\pi}{4}\right)=\frac{1}{2}-\frac{1}{2}\cos\left(\frac{\pi}{4}\right)$$

Our linear interpolant passes through the points $(x_1,x_1^2)$ and $(x_2,x_2^2)$, which using point-slope form gives the equation

$$y-x_1^2=\frac{x_2^2-x_1^2}{x_2-x_1}(x-x_1)$$

Since $x_1+x_2=1$, the slope is $1$, and $x_1x_2=\frac{1}{4}-\frac{1}{8}=\frac{1}{8}$, so the line simplifies to

$$y=x-\frac{1}{8}$$

As a check, the error $x^2-\left(x-\frac{1}{8}\right)=\left(x-\frac{1}{2}\right)^2-\frac{1}{8}$ equioscillates on $[0,1]$, taking the values $+\frac{1}{8},-\frac{1}{8},+\frac{1}{8}$ at $x=0,\frac{1}{2},1$, so the sufficient condition from part (a) with $n=1$ is satisfied.
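As a quick numerical sanity check (a NumPy sketch; the function names here are ours, not part of the problem), the error of the candidate line equioscillates with amplitude $\frac{1}{8}$:

```python
import numpy as np

# Candidate best linear approximation to x^2 on [0,1].
p = lambda x: x - 1/8
err = lambda x: x**2 - p(x)

# The error should equioscillate: +1/8 at x=0 and x=1, -1/8 at x=1/2.
print(err(0.0), err(0.5), err(1.0))   # 0.125 -0.125 0.125

# The maximum error over a fine grid should also be 1/8.
xs = np.linspace(0, 1, 10001)
print(np.max(np.abs(err(xs))))        # 0.125
```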
We will be concerned with the least squares problem of minimizing $\rho^2(x)=\|b-Ax\|^2$. Here $A$ is an $m\times n$ matrix of rank $n$ (which implies $m\geq n$) and $\|\cdot\|$ is the Euclidean vector norm. Let

$$A=\begin{pmatrix}Q_1&Q_2\end{pmatrix}\begin{pmatrix}R\\0\end{pmatrix}$$

be the QR decomposition of $A$. Here $Q_1$, $Q_2$ and $R$ are respectively $m\times n$, $m\times(m-n)$, and $n\times n$.
Show that the solution of the least squares problem satisfies the QR equation $Rx=Q_1^Tb$ and that the solution is unique. Further show that $\rho(x)=\|Q_2^Tb\|$.
The two problems are equivalent

First notice

$$A=\begin{pmatrix}Q_1&Q_2\end{pmatrix}\begin{pmatrix}R\\0\end{pmatrix}=Q_1R$$

Let $Q=\begin{pmatrix}Q_1&Q_2\end{pmatrix}$, which is an $m\times m$ orthogonal matrix; multiplying by an orthogonal matrix does not affect the Euclidean norm. Then we can write

$$\|b-Ax\|^2=\|Q^T(b-Ax)\|^2=\left\|\begin{pmatrix}Q_1^Tb-Rx\\Q_2^Tb\end{pmatrix}\right\|^2=\|Q_1^Tb-Rx\|^2+\|Q_2^Tb\|^2$$

Only the first term depends on $x$, so solving $\min_x\|b-Ax\|^2$ is equivalent to solving $\min_x\|Q_1^Tb-Rx\|^2$, whose minimum value $0$ is attained exactly when $Rx=Q_1^Tb$. A solution exists and is unique since $R$ is $n\times n$ and non-singular (because $A$ has rank $n$).
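This can be checked numerically; the sketch below (NumPy assumed, with a random full-column-rank $A$) solves the QR equation and compares against a reference least squares solver:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 3
A = rng.standard_normal((m, n))   # full column rank with probability 1
b = rng.standard_normal(m)

# Thin QR: Q1 is m-by-n with orthonormal columns, R is n-by-n upper triangular.
Q1, R = np.linalg.qr(A)           # mode='reduced' is the default
x = np.linalg.solve(R, Q1.T @ b)  # solve the QR equation R x = Q1^T b

# Agrees with the reference least squares solver.
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x, x_ref))      # True
```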
Show that $\rho(x)=\|Q_2^Tb\|$
Similarly, writing $Q=\begin{pmatrix}Q_1&Q_2\end{pmatrix}$,

$$\rho^2(x)=\|b-Ax\|^2=\|Q^T(b-Ax)\|^2=\|Q_1^Tb-Rx\|^2+\|Q_2^Tb-\underbrace{Q_2^TQ_1}_{0}Rx\|^2=\|Q_1^Tb-Rx\|^2+\|Q_2^Tb\|^2$$

At the least squares solution the first term vanishes since $Rx=Q_1^Tb$, leaving

$$\rho^2(x)=\|b-Ax\|^2=\|Q_2^Tb\|^2,$$

or simply $\rho(x)=\|Q_2^Tb\|$, as desired.
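Again a numerical sketch (NumPy assumed; random data): using the full QR factorization, the residual norm of the least squares solution matches $\|Q_2^Tb\|$:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 7, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# Full QR: Q = (Q1 Q2) is m-by-m orthogonal.
Q, Rfull = np.linalg.qr(A, mode='complete')
Q1, Q2 = Q[:, :n], Q[:, n:]
R = Rfull[:n, :]

x = np.linalg.solve(R, Q1.T @ b)  # QR equation solution
rho = np.linalg.norm(b - A @ x)   # residual of the least squares solution

print(np.isclose(rho, np.linalg.norm(Q2.T @ b)))   # True
```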
Use the QR equation to show that the least squares solution satisfies the normal equations $(A^TA)x=A^Tb$.
Since $A=Q_1R$ and $Q_1^TQ_1=I$,

$$\begin{aligned}A^TAx&=(Q_1R)^TQ_1Rx=R^TQ_1^TQ_1Rx=R^TRx\\A^Tb&=(Q_1R)^Tb=R^T\underbrace{Q_1^Tb}_{Rx}=R^TRx\end{aligned}$$

where the QR equation $Rx=Q_1^Tb$ was used in the last step. Hence $(A^TA)x=A^Tb$.
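A numeric spot check (NumPy; random data) that the QR-equation solution also satisfies the normal equations:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 2))
b = rng.standard_normal(5)

Q1, R = np.linalg.qr(A)
x = np.linalg.solve(R, Q1.T @ b)  # QR equation solution

# The same x satisfies the normal equations (A^T A) x = A^T b.
print(np.allclose(A.T @ A @ x, A.T @ b))   # True
```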
Let $A$ be real symmetric and let $u_0\neq 0$ be given. For $k=1,2,\ldots$, define $u_{k+1}$ as the linear combination of the vectors $Au_k,u_k,\ldots,u_0$ with the coefficient of $Au_k$ equal to one and orthogonal to the vectors $u_k,\ldots,u_0$; i.e.

$$u_{k+1}=Au_k-\alpha_ku_k-\beta_ku_{k-1}-\gamma_ku_{k-2}-\cdots$$
Find formulas for $\alpha_k$ and $\beta_k$.

Using Gram-Schmidt, we have

$$\alpha_k=\frac{(Au_k,u_k)}{\|u_k\|^2},\qquad\beta_k=\frac{(Au_k,u_{k-1})}{\|u_{k-1}\|^2}$$
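To illustrate, a short NumPy sketch (the variable names are ours) runs two steps of the recurrence with these coefficients and confirms the orthogonality they are designed to enforce:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                     # real symmetric

u0 = rng.standard_normal(n)           # u_0 != 0
alpha0 = (A @ u0) @ u0 / (u0 @ u0)
u1 = A @ u0 - alpha0 * u0             # first step of the recurrence

alpha1 = (A @ u1) @ u1 / (u1 @ u1)
beta1 = (A @ u1) @ u0 / (u0 @ u0)
u2 = A @ u1 - alpha1 * u1 - beta1 * u0

# u2 is orthogonal to both u1 and u0, as the Gram-Schmidt choice guarantees.
print(abs(u2 @ u1) < 1e-10, abs(u2 @ u0) < 1e-10)   # True True
```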
Show that $\gamma_k=\delta_k=\ldots=0$. Where do you use the symmetry of $A$?
Since

$$\gamma_k=\frac{(Au_k,u_{k-2})}{\|u_{k-2}\|^2},$$

it suffices to show that $(Au_k,u_{k-2})=0$.
Since $A$ is symmetric,

$$(Au_k,u_{k-2})=(u_k,A^Tu_{k-2})=(u_k,Au_{k-2})$$
By construction, the vectors are pairwise orthogonal:

$$(u_i,u_j)=0\quad{\mbox{if }}i\neq j$$
Also, from the definition of the recurrence,

$$\begin{aligned}Au_k&=u_{k+1}+\alpha_ku_k+\beta_ku_{k-1}+\gamma_ku_{k-2}+\ldots\\Au_{k-2}&=u_{k-1}+\alpha_{k-2}u_{k-2}+\beta_{k-2}u_{k-3}+\gamma_{k-2}u_{k-4}+\ldots\end{aligned}$$
Using the above results we have

$$\begin{aligned}(Au_k,u_{k-2})&=(u_k,Au_{k-2})\\&=(u_k,u_{k-1}+\alpha_{k-2}u_{k-2}+\beta_{k-2}u_{k-3}+\gamma_{k-2}u_{k-4}+\ldots)\\&=(u_k,u_{k-1})+\alpha_{k-2}(u_k,u_{k-2})+\beta_{k-2}(u_k,u_{k-3})+\gamma_{k-2}(u_k,u_{k-4})+\ldots\\&=0\end{aligned}$$

since $u_k$ is orthogonal to every $u_j$ with $j<k$. Hence $\gamma_k=0$, and the same argument applied to $u_{k-3},u_{k-4},\ldots$ shows $\delta_k=\ldots=0$. The symmetry of $A$ is used in the first equality, to move $A$ onto the vector with the smaller index.
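The vanishing of $\gamma_k,\delta_k,\ldots$ can be observed numerically. In the sketch below (NumPy; symmetric $A$ built from a random matrix) only the $\alpha$ and $\beta$ terms are subtracted, yet every pair of generated vectors comes out orthogonal:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 8
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                 # symmetry is essential here

u_prev, u = np.zeros(n), rng.standard_normal(n)
us = [u]
for k in range(4):
    alpha = (A @ u) @ u / (u @ u)
    beta = (A @ u) @ u_prev / (u_prev @ u_prev) if k > 0 else 0.0
    u_next = A @ u - alpha * u - beta * u_prev   # no gamma term subtracted
    us.append(u_next)
    u_prev, u = u, u_next

# Despite subtracting only two terms, every pair is orthogonal:
# gamma_k, delta_k, ... would all evaluate to zero.
worst = max(abs(us[i] @ us[j])
            for i in range(len(us)) for j in range(i))
print(worst < 1e-6)               # True
```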
For which non-zero vectors $u_0$ does $u_1=0$ hold?
For $k=0$,

$$u_1=Au_0-\alpha_0u_0$$

If $u_1=0$, then $Au_0=\alpha_0u_0$. Since $\alpha_0$ is a scalar, $u_0$ is an eigenvector of $A$. Conversely, if $Au_0=\lambda u_0$, then $\alpha_0=\frac{(Au_0,u_0)}{\|u_0\|^2}=\lambda$ and $u_1=0$. Hence $u_1=0$ holds precisely when $u_0$ is an eigenvector of $A$.
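A small NumPy sketch (eigenvector taken from `numpy.linalg.eigh`) illustrates this: starting from an eigenvector, $\alpha_0$ reproduces the eigenvalue and $u_1$ vanishes:

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                    # real symmetric

# Take u0 to be an eigenvector of A.
w, V = np.linalg.eigh(A)
u0 = V[:, 0]

alpha0 = (A @ u0) @ u0 / (u0 @ u0)   # equals the eigenvalue w[0]
u1 = A @ u0 - alpha0 * u0

print(np.isclose(alpha0, w[0]), np.allclose(u1, 0))   # True True
```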