Let $a = x_0 < x_1 < \cdots < x_n = b$ be an arbitrary fixed partition of the interval $[a,b]$. A function $q(x)$ is a quadratic spline function if

(i) $q \in C^1[a,b]$

(ii) on each subinterval $[x_i, x_{i+1}]$, $q$ is a quadratic polynomial.
The problem is to construct a quadratic spline $q(x)$ interpolating the data points $(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$. The construction is similar to the construction of the cubic spline but much easier.
Show that if we know $z_i \equiv q'(x_i)$, $i = 0, \ldots, n$, we can construct $q$.
Consider the interval $[x_i, x_{i+1}]$. Since $q_i'(x)$ is linear on this interval, using the point-slope form we have

$$q_i'(x) = \frac{z_{i+1} - z_i}{x_{i+1} - x_i}(x - x_i) + z_i$$
Integrating, we have

$$q_i(x) = z_i x + \frac{z_{i+1} - z_i}{x_{i+1} - x_i}\,\frac{(x - x_i)^2}{2} + K_i$$
or, in a more convenient form (choosing the constant so that $q_i(x_i) = y_i$),

$$q_i(x) = z_i(x - x_i) + \frac{z_{i+1} - z_i}{x_{i+1} - x_i}\,\frac{(x - x_i)^2}{2} + y_i$$
Since $q$ is continuous on $[a,b]$,

$$q_{i-1}(x_i) = q_i(x_i) = y_i$$
i.e.

$$z_{i-1}(x_i - x_{i-1}) + \frac{z_i - z_{i-1}}{x_i - x_{i-1}}\,\frac{(x_i - x_{i-1})^2}{2} + y_{i-1} = y_i$$
Simplifying and rearranging terms yields the recurrence formula

$$z_i = -z_{i-1} + \frac{2(y_i - y_{i-1})}{x_i - x_{i-1}}, \quad i = 1, \ldots, n$$
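Once a starting slope $z_0$ is chosen (it is the one free parameter of the interpolating quadratic spline), the recurrence determines all remaining slopes, and the piecewise formula above evaluates $q$. The following Python sketch illustrates this; the function names and the test data are illustrative, not part of the problem.

```python
import numpy as np

def spline_slopes(x, y, z0):
    """Slopes z_i = q'(x_i) from the recurrence
    z_i = -z_{i-1} + 2 (y_i - y_{i-1}) / (x_i - x_{i-1})."""
    z = np.zeros(len(y))
    z[0] = z0
    for i in range(1, len(y)):
        z[i] = -z[i - 1] + 2.0 * (y[i] - y[i - 1]) / (x[i] - x[i - 1])
    return z

def spline_eval(x, y, z, t):
    """Evaluate q_i(t) = z_i (t - x_i) + (z_{i+1} - z_i)/(x_{i+1} - x_i) * (t - x_i)^2 / 2 + y_i
    on the subinterval containing t."""
    i = min(max(np.searchsorted(x, t) - 1, 0), len(x) - 2)
    h = t - x[i]
    return z[i] * h + (z[i + 1] - z[i]) / (x[i + 1] - x[i]) * h ** 2 / 2.0 + y[i]

# Illustrative data: nodes and values of sin on [0, pi], with z_0 = cos(0) = 1.
x = np.linspace(0.0, np.pi, 6)
y = np.sin(x)
z = spline_slopes(x, y, 1.0)
print(spline_eval(x, y, z, x[2]), y[2])        # interpolates the data at the nodes
print(spline_eval(x, y, z, 1.0), np.sin(1.0))  # approximates sin between nodes
```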
The unshifted QR iteration is

$$Q_k R_k = A_k, \qquad A_{k+1} = R_k Q_k$$

and the shifted QR iteration is

$$Q_k R_k = A_k - \chi_k I, \qquad A_{k+1} = R_k Q_k + \chi_k I$$
Suppose $A_m = U^T A U$ for some unitary $U$. Since $R_m = Q_m^T A_m$, we have

$$A_{m+1} = R_m Q_m = Q_m^T A_m Q_m = Q_m^T U^T A U Q_m = (U Q_m)^T A (U Q_m)$$

which is of the desired form, since $U Q_m$ is unitary.
For the shifted case, the same argument holds using the fact that $R_m = Q_m^T (A_m - \chi_m I)$, which gives $A_{m+1} = R_m Q_m + \chi_m I = Q_m^T (A_m - \chi_m I) Q_m + \chi_m I = Q_m^T A_m Q_m$.
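One numerical consequence of this fact is that every iterate has the same eigenvalues as $A$, whether or not a shift is used. A brief sketch (using a library QR factorization rather than explicit rotations, and a symmetric test matrix of my own choosing):

```python
import numpy as np

def qr_iterate(A, k, shift=0.0):
    """Apply k steps of the (optionally shifted) QR iteration:
    Q R = A_k - shift*I,  A_{k+1} = R Q + shift*I."""
    Ak = A.copy()
    n = A.shape[0]
    for _ in range(k):
        Q, R = np.linalg.qr(Ak - shift * np.eye(n))
        Ak = R @ Q + shift * np.eye(n)
    return Ak

# The iterates stay unitarily similar to A, so their eigenvalues never change.
rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B + B.T                        # symmetric test matrix
A5 = qr_iterate(A, 5, shift=0.3)
print(np.allclose(np.linalg.eigvalsh(A), np.linalg.eigvalsh(A5)))   # True
```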
Let

$$A = \begin{pmatrix} 3 & 1 \\ 1 & 2 \end{pmatrix}$$

Use plane rotations (Givens rotations) to carry out one step of the $QR$ algorithm on $A$, first without shifting and then using the shift $\chi_0 = 1$. Which seems to do better? The eigenvalues of $A$ are $\lambda_1 = 1.382$, $\lambda_2 = 3.618$. (Recall that a plane rotation is a matrix of the form

$$Q = \begin{pmatrix} c & -s \\ s & c \end{pmatrix}$$

with $c^2 + s^2 = 1$.)
One step of the QR algorithm without shifting gives

$$A_1 = \begin{pmatrix} 3.5 & 0.5 \\ 0.5 & 1.5 \end{pmatrix}$$

while one step with the shift $\chi_0 = 1$ gives

$$A_1 = \begin{pmatrix} 3.6 & 0.2 \\ 0.2 & 1.4 \end{pmatrix}$$

The shifted iteration appears to work better because its diagonal entries are closer to the actual eigenvalues than the diagonal entries of the unshifted iteration.
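The arithmetic behind these two steps can be reproduced with a short computation. The sketch below (the function name is mine; it assumes the $2 \times 2$ case and zeroes the $(2,1)$ entry with a single plane rotation) gives both matrices:

```python
import numpy as np

def qr_step_givens_2x2(A, shift=0.0):
    """One QR step on a 2x2 matrix: factor A - shift*I = Q R with one
    Givens rotation, then return R Q + shift*I."""
    B = A - shift * np.eye(2)
    r = np.hypot(B[0, 0], B[1, 0])
    c, s = B[0, 0] / r, B[1, 0] / r
    Q = np.array([[c, -s],
                  [s,  c]])            # plane rotation, c^2 + s^2 = 1
    R = Q.T @ B                        # upper triangular by construction
    return R @ Q + shift * np.eye(2)

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
print(qr_step_givens_2x2(A))              # [[3.5, 0.5], [0.5, 1.5]]
print(qr_step_givens_2x2(A, shift=1.0))   # [[3.6, 0.2], [0.2, 1.4]]
```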
Let $A$ be an $N \times N$ symmetric, positive definite matrix. Then we know that solving $Ax = b$ is equivalent to minimizing the functional

$$F(x) = \frac{1}{2}\langle x, Ax\rangle - \langle b, x\rangle$$

where $\langle \cdot, \cdot \rangle$ denotes the standard inner product in $R^N$. To solve the problem by minimization of $F$ we consider the general iterative method

$$x_{v+1} = x_v + \alpha_v d_v$$
When $x_v$ and $d_v$ are given, show that the value of $\alpha_v$ which minimizes $F(x_v + \alpha d_v)$ as a function of $\alpha$ is given in terms of the residual $r_v$:

$$\alpha_v = \frac{\langle r_v, d_v\rangle}{\langle d_v, A d_v\rangle}, \quad r_v = b - A x_v$$
Since $A$ is symmetric,

$$\langle Ax, y\rangle = \langle x, A^T y\rangle = \langle x, Ay\rangle$$

This relationship will be used throughout the solutions.
Substitute into Functional
$$\begin{aligned}F(x_v + \alpha d_v) &= \frac{1}{2}\langle x_v + \alpha d_v,\, A x_v + \alpha A d_v\rangle - \langle b,\, x_v + \alpha d_v\rangle \\ &= \frac{\langle d_v, A d_v\rangle}{2}\,\alpha^2 + \alpha\langle d_v, A x_v\rangle - \alpha\langle b, d_v\rangle + \frac{1}{2}\langle x_v, A x_v\rangle - \langle b, x_v\rangle\end{aligned}$$
Take Derivative With Respect to Alpha
$$F'(x_v + \alpha d_v) = \langle d_v, A d_v\rangle\,\alpha + \langle d_v, A x_v\rangle - \langle b, d_v\rangle$$
Set Derivative Equal To Zero
$$0 = \langle d_v, A d_v\rangle\,\alpha + \langle d_v, A x_v - b\rangle$$
which implies

$$\alpha = \frac{\langle d_v, r_v\rangle}{\langle d_v, A d_v\rangle}$$
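As a quick numerical check of this formula (the helper names and the small SPD test matrix below are illustrative, not part of the problem), $F(x_v + \alpha d_v)$ is indeed smallest at the computed $\alpha$:

```python
import numpy as np

def F(A, b, x):
    """The quadratic functional F(x) = 1/2 <x, A x> - <b, x>."""
    return 0.5 * x @ A @ x - b @ x

def optimal_alpha(A, b, x, d):
    """alpha = <r, d> / <d, A d> with residual r = b - A x."""
    r = b - A @ x
    return (r @ d) / (d @ A @ d)

# Small SPD example: F along x + alpha*d is minimized at the computed alpha.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = np.zeros(2)
d = np.array([1.0, 0.0])
a_star = optimal_alpha(A, b, x, d)
for a in (a_star - 0.1, a_star, a_star + 0.1):
    print(a, F(A, b, x + a * d))  # the middle line has the smallest F value
```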
Let $\{d_j\}_{j=1}^N$ be an $A$-orthogonal basis of $R^N$, i.e. $\langle d_j, A d_k\rangle = 0$ for $j \neq k$. Consider the expansion of the solution $x$ in this basis:

$$x = \sum_{j=1}^N \hat{x}_j d_j$$
Use the $A$-orthogonality of the $d_j$ to compute the $\hat{x}_j$ in terms of the solution $x$ and the $d_j$'s.
$$\begin{aligned}\langle x, A d_k\rangle &= \Big\langle \sum_{j=1}^N \hat{x}_j d_j,\; A d_k\Big\rangle \\ &= \sum_{j=1}^N \hat{x}_j\langle d_j, A d_k\rangle \\ &= \hat{x}_k\langle d_k, A d_k\rangle\end{aligned}$$
which implies

$$\hat{x}_k = \frac{\langle x, A d_k\rangle}{\langle d_k, A d_k\rangle}$$
Let $x_v$ denote the partial sum

$$x_v = \sum_{j=1}^{v-1} \hat{x}_j d_j, \quad v = 2, 3, \ldots$$
so that

$$x_{v+1} = x_v + \hat{x}_v d_v$$
where the $\hat{x}_v$'s are the coefficients found in (b). Use the fact that $x_v \in \operatorname{span}\{d_1, d_2, \ldots, d_{v-1}\}$ and the $A$-orthogonality of the $d_j$'s to conclude that the coefficient $\hat{x}_v$ is given by the optimal $\alpha_v$, i.e.

$$\hat{x}_v = \frac{\langle r_v, d_v\rangle}{\langle d_v, A d_v\rangle} = \alpha_v$$
$$\begin{aligned}\langle r_v, d_v\rangle &= \langle b - A x_v,\, d_v\rangle \\ &= \langle A x - A x_v,\, d_v\rangle \\ &= \langle A x, d_v\rangle - \langle A x_v, d_v\rangle \\ &= \langle x, A d_v\rangle - \underbrace{\langle x_v, A d_v\rangle}_{0} \\ &= \hat{x}_v\langle d_v, A d_v\rangle\end{aligned}$$
which implies

$$\hat{x}_v = \frac{\langle r_v, d_v\rangle}{\langle d_v, A d_v\rangle} = \alpha_v$$

as desired.
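Putting parts (a)-(c) together: when the directions are $A$-orthogonal, the line-search-optimal steps $\alpha_v$ coincide with the expansion coefficients $\hat{x}_v$, so the iteration reaches the exact solution after $N$ steps. The Python sketch below illustrates this on a randomly generated SPD matrix; building the $d_j$ by Gram-Schmidt in the $A$-inner product is my own choice of test basis, not part of the problem.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
M = rng.standard_normal((N, N))
A = M @ M.T + N * np.eye(N)            # symmetric positive definite test matrix
b = rng.standard_normal(N)
x_true = np.linalg.solve(A, b)

# Build an A-orthogonal basis {d_j} by Gram-Schmidt in the A-inner product.
d = []
for e in np.eye(N):
    v = e.copy()
    for dj in d:
        v = v - (dj @ A @ e) / (dj @ A @ dj) * dj
    d.append(v)

# Accumulate the partial sums x_{v+1} = x_v + alpha_v d_v.
x = np.zeros(N)
for dv in d:
    r = b - A @ x                                 # residual r_v
    alpha = (r @ dv) / (dv @ A @ dv)              # optimal step from part (a)
    coeff = (x_true @ A @ dv) / (dv @ A @ dv)     # coefficient x_hat_v from part (b)
    assert np.isclose(alpha, coeff)               # they agree, as shown in part (c)
    x = x + alpha * dv

print(np.allclose(x, x_true))                     # True: exact solution in N steps
```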