State Newton's method for the approximate solution of $f(x)=0$, where $f(x)$ is a real-valued function of the real variable $x$.

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}$$
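As an illustration only (a minimal sketch, not part of the original solution; the tolerance and iteration cap are assumed choices), the iteration can be coded as:

```python
# Minimal sketch of the Newton iteration x_{k+1} = x_k - f(x_k)/f'(x_k).
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:          # stop once the residual is small
            break
        x -= fx / fprime(x)        # Newton update
    return x

# Example: the positive root of f(x) = x^2 - 2 is sqrt(2).
root = newton(lambda x: x**2 - 2, lambda x: 2 * x, 1.0)
```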
State and prove a convergence result for the method.
Assume $f(x_*)=0$, $f'$ is Lipschitz continuous with constant $L$ in a neighborhood of $x_*$, and $f'(x_*)\neq 0$. Then, for $x_0$ sufficiently close to $x_*$, the Newton iterates are well defined and converge to $x_*$ quadratically; more precisely, with $e_k = x_k - x_*$,

$$
\begin{aligned}
x_{k+1} &= x_k - f'(x_k)^{-1} f(x_k)\\
\underbrace{x_{k+1}-x_*}_{e_{k+1}} &= \underbrace{x_k - x_*}_{e_k} - f'(x_k)^{-1} f(x_k)\\
&= f'(x_k)^{-1}\Bigl[f'(x_k)e_k - \bigl(f(x_k) - \underbrace{f(x_*)}_{0}\bigr)\Bigr]\\
&= f'(x_k)^{-1}\Bigl[f'(x_k)e_k - e_k\int_0^1 f'\bigl(sx_k + (1-s)x_*\bigr)\,ds\Bigr]\\
&= f'(x_k)^{-1}\Bigl[f'(x_k)e_k - f'(x_*)e_k - e_k\int_0^1 \bigl(f'(sx_k+(1-s)x_*)-f'(x_*)\bigr)\,ds\Bigr]\\
\|e_{k+1}\| &\le \|f'(x_k)^{-1}\|\Bigl[L\|e_k\|^2 + \frac{L}{2}\|e_k\|^2\Bigr] \quad\text{where } L \text{ is the Lipschitz constant of } f'\\
&= \frac{3}{2}\,L\,\|f'(x_k)^{-1}\|\,\|e_k\|^2.
\end{aligned}
$$

Here the first term in the bracket was bounded using $\|f'(x_k)-f'(x_*)\|\le L\|e_k\|$, and the second using $\|f'(sx_k+(1-s)x_*)-f'(x_*)\|\le Ls\|e_k\|$, which integrates to $\tfrac{L}{2}\|e_k\|$.
What is the typical order of convergence? Are there situations in which the order of convergence is higher? Explain your answers to these questions.
The typical order of local convergence is quadratic.
Consider Newton's method as a fixed point iteration, i.e.

$$x_{k+1} = \underbrace{x_k - \frac{f(x_k)}{f'(x_k)}}_{g(x_k)}.$$
Then

$$g'(x_k) = 1 - \frac{f'(x_k)^2 - f(x_k)f''(x_k)}{f'(x_k)^2} = \frac{f(x_k)f''(x_k)}{f'(x_k)^2},$$

so, since $f(x_*)=0$ and $f'(x_*)\neq 0$,

$$g'(x_*) = 0.$$
Expanding $g(x_k)$ around $x_*$ gives an expression for the error:

$$\underbrace{g(x_k)}_{x_{k+1}} = \underbrace{g(x_*)}_{x_*} + \underbrace{g'(x_*)}_{0}\,e_k + \frac{g''(x_*)}{2}e_k^2 + \frac{g'''(\xi)}{6}e_k^3,$$

for some $\xi$ between $x_k$ and $x_*$.
Note that if $g''(x_*)=0$, then the quadratic term vanishes and we have better than quadratic (at least cubic) convergence. Since $g''(x_*)=f''(x_*)/f'(x_*)$ at a simple root, this happens precisely when $f''(x_*)=0$.
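As an illustrative numerical check (the example functions are assumed here, not part of the original solution): for $f(x)=x^2-2$ the root is simple with $f''(x_*)\neq 0$, so the convergence is the typical quadratic; for $f(x)=\sin x$ at $x_*=0$ we have $f''(x_*)=0$, so the convergence is cubic.

```python
# Record Newton errors for two test problems to compare convergence orders.
import math

def newton_errors(f, fprime, x0, x_star, steps=6):
    x, errs = x0, []
    for _ in range(steps):
        errs.append(abs(x - x_star))   # error before each update
        x -= f(x) / fprime(x)          # Newton step
    return errs

quad  = newton_errors(lambda x: x**2 - 2, lambda x: 2 * x, 1.5, math.sqrt(2))
cubic = newton_errors(math.sin, math.cos, 0.5, 0.0)
# The ratio e_{k+1}/e_k^2 settles to a constant in the first case,
# while e_{k+1}/e_k^3 settles to a constant in the second.
```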
Consider the boundary value problem

$$\begin{cases} -u''(x) + u(x) = f(x), & 0\le x\le 1,\\ u'(0)=2,\quad u(1)=0. \end{cases}\qquad (1)$$
Derive a variational formulation for (1).
Find $u\in H$ such that for all $v\in H=\{v\in H^1[0,1]: v(1)=0\}$,

$$\underbrace{\int_0^1 u'v'\,dx + \int_0^1 uv\,dx}_{a(u,v)} = \underbrace{\int_0^1 fv\,dx - 2v(0)}_{F(v)}.$$
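For completeness (this step is implicit in the solution above), the weak form follows by multiplying (1) by a test function $v$ with $v(1)=0$ and integrating by parts:

$$\int_0^1(-u''+u)v\,dx = \int_0^1 u'v'\,dx - \bigl[u'v\bigr]_0^1 + \int_0^1 uv\,dx = \int_0^1 u'v'\,dx + u'(0)v(0) + \int_0^1 uv\,dx,$$

since $v(1)=0$. Using $u'(0)=2$ and moving the known boundary contribution $2v(0)$ to the right-hand side gives the stated formulation.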
What do we mean by a finite element approximation $u_h$ to $u$?
Let $P=\{x_i\}_{i=0}^{N}$ be a partition of $[0,1]$. Choose an appropriate discrete subspace $V_h\subset H$ (for example, continuous piecewise linear functions on $P$ that vanish at $x=1$) and basis functions $\{\phi_i\}_{i=0}^{N}$. Then

$$u_h = \sum_{i=0}^{N} u_{hi}\,\phi_i.$$
The coefficients $u_{hi}$ can be found by solving the following system of equations: for $j=0,1,\ldots,N$,

$$\sum_{i=0}^{N} u_{hi}\int_0^1 \phi_i'\phi_j'\,dx + \sum_{i=0}^{N} u_{hi}\int_0^1 \phi_i\phi_j\,dx = \int_0^1 f\phi_j\,dx - 2\phi_j(0).$$
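As an illustration only (a sketch under assumed choices, not part of the original solution: the function name `fem_solve`, the uniform mesh, the trapezoidal-rule load, and enforcing $u_h(1)=0$ by dropping the basis function at $x_N=1$ are all implementation decisions made here), the system can be assembled and solved with piecewise-linear hat functions as follows:

```python
# Piecewise-linear FEM sketch for -u'' + u = f on (0,1), u'(0) = 2, u(1) = 0.
import numpy as np

def fem_solve(N, f):
    h = 1.0 / N
    x = np.linspace(0.0, 1.0, N + 1)
    n = N                          # unknowns at nodes 0..N-1 (node N is Dirichlet)
    A = np.zeros((n, n))           # stiffness + mass matrix, entries a(phi_i, phi_j)
    b = np.zeros(n)                # load vector F(phi_j)

    # exact local element matrices on an interval of length h
    K = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h          # int phi_i' phi_j'
    M = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])      # int phi_i  phi_j

    for e in range(N):             # element [x_e, x_{e+1}]
        idx = [e, e + 1]
        for p in range(2):
            if idx[p] >= n:        # skip the Dirichlet node x_N = 1
                continue
            # trapezoidal rule for int f * phi_j over the element
            b[idx[p]] += 0.5 * h * f(x[idx[p]])
            for q in range(2):
                if idx[q] >= n:
                    continue
                A[idx[p], idx[q]] += K[p, q] + M[p, q]

    b[0] -= 2.0                    # Neumann datum: -2 * phi_0(0)
    u = np.zeros(N + 1)            # include u_h(1) = 0
    u[:n] = np.linalg.solve(A, b)
    return x, u
```

For example, `fem_solve(32, lambda x: 0 * x)` treats the case $f\equiv 0$, whose exact solution is $u(x)=2(e^x-e^{2-x})/(1+e^2)$; the same test problem is used in the sanity check after the error estimate below.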
State and prove an estimate for

$$\|u-u_h\|_1 \equiv \left(\int_0^1 |u(x)-u_h(x)|^2\,dx + \int_0^1 |u'(x)-u_h'(x)|^2\,dx\right)^{1/2}.$$
Céa's Lemma:

$$\|u-u_h\|_1 \le C\inf_{v\in V_h}\|u-v\|_1.$$
In particular, choose $v=v_I$, the piecewise linear interpolant of $u$ on $P$. Standard interpolation estimates give $\|u-v_I\|_1\le Ch\|u''\|_\infty$, so

$$\|u-u_h\|_1 \le C\,h\,\|u''\|_\infty.$$
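As a small self-contained sanity check (the test problem and quadrature are illustrative choices, not from the original): the code below measures the $H^1$-type error of the piecewise-linear interpolant of $u(x)=2(e^x-e^{2-x})/(1+e^2)$, the exact solution of (1) for $f\equiv 0$. By Céa's lemma this interpolation error controls $\|u-u_h\|_1$, and halving $h$ should roughly halve the error, consistent with the $O(h)$ bound.

```python
# Interpolation error of the piecewise-linear interpolant in the H^1-type norm.
import numpy as np

c  = 2.0 / (1.0 + np.e**2)
u  = lambda x: c * (np.exp(x) - np.exp(2 - x))     # exact solution for f = 0
du = lambda x: c * (np.exp(x) + np.exp(2 - x))     # its derivative

def h1_interp_error(N, quad_pts=200):
    x, err2 = np.linspace(0.0, 1.0, N + 1), 0.0
    for k in range(N):
        s = np.linspace(x[k], x[k + 1], quad_pts)
        slope = (u(x[k + 1]) - u(x[k])) / (x[k + 1] - x[k])  # constant slope
        uI = u(x[k]) + slope * (s - x[k])                    # linear interpolant
        err2 += np.trapz((u(s) - uI) ** 2 + (du(s) - slope) ** 2, s)
    return np.sqrt(err2)

for N in (8, 16, 32, 64):
    print(N, h1_interp_error(N))   # errors should roughly halve as h halves
```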
Alternative Solution 6c
Let $\{x_k\}_{k=0}^{n}$ be a discrete mesh of $[0,1]$ with step size $h$. Consider the integral

$$\int_{x_k}^{x_{k+1}} |u(x)-u_h(x)|^2\,dx.$$
For some $\xi_k\in[x_k,x_{k+1}]$,

$$|u(x)-u_h(x)| \le |u'(\xi_k)|\,h,$$

as $u_h$ is just a linear interpolation on this interval. Hence
$$\int_{x_k}^{x_{k+1}} |u(x)-u_h(x)|^2\,dx \le \int_{x_k}^{x_{k+1}} h^2\,u'(\xi_k)^2\,dx = h^3\,u'(\xi_k)^2 \le h^3\|u'\|_\infty^2.$$
Similarly, we can bound the squared $L^2(x_k,x_k+h)$ norm of the error in the derivatives by $h^3\|u''\|_\infty^2$. With $n=1/h$ such intervals we have

$$\|u-u_h\|_1^2 \le 2nh^3\max\bigl(\|u'\|_\infty^2, \|u''\|_\infty^2\bigr) = 2h^2\max\bigl(\|u'\|_\infty^2, \|u''\|_\infty^2\bigr).$$
Prove the formula

$$\|u-u_h\|_1^2 = \|u\|_1^2 - \|u_h\|_1^2.$$
Since $a(v,v)=\|v\|_1^2$ for this bilinear form and, by Galerkin orthogonality, $a(u-u_h,v_h)=0$ for every $v_h\in V_h$ (in particular for $v_h=u_h$),

$$
\begin{aligned}
\underbrace{a(u-u_h,\,u-u_h)}_{\|u-u_h\|_1^2} + 2\underbrace{a(u-u_h,\,u_h)}_{0}
&= a(u,u)-2a(u,u_h)+a(u_h,u_h)+2a(u,u_h)-2a(u_h,u_h)\\
&= a(u,u)-a(u_h,u_h)\\
&= \|u\|_1^2 - \|u_h\|_1^2.
\end{aligned}
$$
Consider the initial value problem

$$y'=f(t,y),\quad y(t_0)=y_0$$
where $f$ satisfies the Lipschitz condition

$$|f(t,y)-f(t,z)| \le L|y-z|$$
for all $t,y,z$. A numerical method called the midpoint rule for solving this problem is defined by

$$y_{n+1} = y_{n-1} + 2hf(t_n,y_n)$$
where $h$ is a time step and $y_n\approx y(t_n)$ for $t_n = t_0 + nh$. Here $y_0$ is given and $y_1$ is presumed to be computed by some other method.
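As an illustration (a sketch with an assumed starter step, since the text only says that $y_1$ comes from some other method), the rule might be implemented as:

```python
# Midpoint (leapfrog) rule y_{n+1} = y_{n-1} + 2h f(t_n, y_n).
# The starting value y_1 is produced here with one explicit Euler step,
# standing in for the unspecified "other method".
import numpy as np

def midpoint_rule(f, t0, y0, b, M):
    h = (b - t0) / M
    t = t0 + h * np.arange(M + 1)
    y = np.empty(M + 1)
    y[0] = y0
    y[1] = y0 + h * f(t[0], y0)            # assumed starter step
    for n in range(1, M):
        y[n + 1] = y[n - 1] + 2.0 * h * f(t[n], y[n])
    return t, y

# Example: y' = -y, y(0) = 1 on [0, 5]; the exact solution is e^{-t}.
t, y = midpoint_rule(lambda t, y: -y, 0.0, 1.0, 5.0, 200)
```

Running it on $y'=-y$ over a long enough interval already hints at the weak instability discussed at the end of this section: the computed solution eventually develops a growing oscillation.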
Suppose the problem is posed on a finite interval $t_0\le t\le b$, where $h=\frac{b-t_0}{M}$. Show directly, i.e., without citing any major results, that the midpoint rule is stable. That is, show that if $\hat{y}_0$ and $\hat{y}_1$ satisfy

$$|y_0-\hat{y}_0|\le \epsilon,\quad |y_1-\hat{y}_1|\le \epsilon,$$

then there exists a constant $C$ independent of $h$ such that

$$|y_n-\hat{y}_n|\le C\epsilon\quad\text{for } 0\le n\le M.$$
$$
\begin{aligned}
y_{n+1} &= y_{n-1} + 2hf(t_n,y_n)\\
\hat{y}_{n+1} &= \hat{y}_{n-1} + 2hf(t_n,\hat{y}_n)
\end{aligned}
$$
Subtracting both equations, letting $e_n\equiv y_n-\hat{y}_n$, and applying the Lipschitz property yields
$$
\begin{aligned}
|e_{n+1}| &\le |e_{n-1}| + 2hL|e_n|\\
&\le (1+2hL)\max\{|e_n|,|e_{n-1}|\}.
\end{aligned}
$$
Therefore, by induction on $\max\{|e_n|,|e_{n-1}|\}$,

$$
\begin{aligned}
|e_n| &\le (1+2hL)^{n-1}\epsilon\\
&\le (1+2hL)^{n}\epsilon\\
&\le e^{2Lhn}\epsilon\\
&= e^{2L(t_n-t_0)}\epsilon \le e^{2L(b-t_0)}\epsilon,
\end{aligned}
$$

so the bound holds with $C=e^{2L(b-t_0)}$, which is independent of $h$.
Substituting $f(t,y)=\lambda y$ into the midpoint rule we have
$$y_{n+1} = y_{n-1} + 2h\lambda y_n,$$
or

$$y_{n+1} - 2h\lambda y_n - y_{n-1} = 0.$$
The solution of this difference equation is given by

$$y_n = C_1 z_1^n + C_2 z_2^n,$$
where $z_1, z_2$ are the roots of the quadratic

$$z^2 - 2h\lambda z - 1 = 0.$$
The quadratic formula yields

$$z = h\lambda \pm \sqrt{1 + h^2\lambda^2}.$$
If $\lambda$ is a negative number, then one of the roots, $z = h\lambda - \sqrt{1+h^2\lambda^2}$, has absolute value greater than 1. Hence, unless its coefficient is exactly zero, $|y_n|\rightarrow\infty$ as $n\rightarrow\infty$ instead of decaying to zero like the exact solution $y(t)=e^{\lambda t}$.
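A quick numerical illustration of this parasitic root (the values of $\lambda$ and $h$ are assumed here, not part of the original solution):

```python
# Roots of z^2 - 2*h*lambda*z - 1 = 0 for the midpoint rule on y' = lambda*y.
import numpy as np

lam, h = -1.0, 0.1
z1 = h * lam + np.sqrt(1.0 + (h * lam) ** 2)   # principal root, |z1| < 1
z2 = h * lam - np.sqrt(1.0 + (h * lam) ** 2)   # parasitic root, |z2| > 1
print(abs(z1), abs(z2))                        # roughly 0.905 and 1.105

# Any perturbation of y_0, y_1 excites the z2 mode, which grows like |z2|^n
# with alternating sign, so the computed solution eventually blows up even
# though the exact solution e^{lambda t} decays.
```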