# Calculus/Inverse function theorem, implicit function theorem


In this chapter, we want to prove the inverse function theorem (which asserts that if a function has invertible differential at a point, then it is locally invertible itself) and the implicit function theorem (which asserts that certain sets are the graphs of functions).

## Banach's fixed point theorem

Theorem:

Let $(M,d)$  be a complete metric space, and let $f:M\to M$  be a strict contraction; that is, there exists a constant $0\leq \lambda <1$  such that

$\forall m,n\in M:d(f(m),f(n))\leq \lambda d(m,n)$ .

Then $f$  has a unique fixed point, which means that there is a unique $x\in M$  such that $f(x)=x$ . Furthermore, if we start with a completely arbitrary point $y\in M$ , then the sequence

$y,f(y),f(f(y)),f(f(f(y))),\ldots$

converges to $x$ .

Proof:

First, we prove uniqueness of the fixed point. Assume $x,y$  are both fixed points. Then

$d(x,y)=d(f(x),f(y))\leq \lambda d(x,y)\Rightarrow (1-\lambda )d(x,y)=0$ .

Since $0\leq \lambda <1$ , this implies $d(x,y)=0\Rightarrow x=y$ .

Now we prove existence and simultaneously the claim about the convergence of the sequence $y,f(y),f(f(y)),f(f(f(y))),\ldots$ . For notation, we thus set $z_{0}:=y$  and if $z_{n}$  is already defined, we set $z_{n+1}=f(z_{n})$ . Then the sequence $(z_{n})_{n\in \mathbb {N} }$  is nothing else but the sequence $y,f(y),f(f(y)),f(f(f(y))),\ldots$ .

Let $n\geq 0$ . We claim that

$d(z_{n+1},z_{n})\leq \lambda ^{n}d(z_{1},z_{0})$ .

Indeed, this follows by induction on $n$ . The case $n=0$  is trivial, and if the claim is true for $n$ , then $d(z_{n+2},z_{n+1})=d(f(z_{n+1}),f(z_{n}))\leq \lambda d(z_{n+1},z_{n})\leq \lambda \cdot \lambda ^{n}d(z_{1},z_{0})$ .

Hence, by the triangle inequality,

${\begin{aligned}d(z_{n+m},z_{n})&\leq \sum _{j=n+1}^{n+m}d(z_{j},z_{j-1})\\&\leq \sum _{j=n+1}^{n+m}\lambda ^{j-1}d(z_{1},z_{0})\\&\leq \sum _{j=n+1}^{\infty }\lambda ^{j-1}d(z_{1},z_{0})\\&=d(z_{1},z_{0})\lambda ^{n}{\frac {1}{1-\lambda }}\end{aligned}}$ .

The latter expression goes to zero as $n\to \infty$  and hence we are dealing with a Cauchy sequence. As we are in a complete metric space, it converges to a limit $x$ . This limit further is a fixed point, as the continuity of $f$  ($f$  is Lipschitz continuous with constant $\lambda$ ) implies

$x=\lim _{n\to \infty }z_{n}=\lim _{n\to \infty }f(z_{n-1})=f(\lim _{n\to \infty }z_{n-1})=f(x)$ .$\Box$
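The iteration in the theorem can be tried out numerically. The following is a minimal sketch under assumptions that are not part of the text above: the space is $M=[0,1]$ with the usual distance, the map is $\cos$, and the contraction constant is taken to be $\lambda =\sin(1)$ (which bounds $|\cos '|$ on $[0,1]$). The helper name `fixed_point_iterates` is purely illustrative. The sketch also checks the a priori estimate $d(z_{n},x)\leq {\frac {\lambda ^{n}}{1-\lambda }}d(z_{1},z_{0})$, obtained from the chain of inequalities in the proof by letting $m\to \infty$ .

```python
import math

def fixed_point_iterates(f, y, steps):
    """Return [z_0, ..., z_steps] with z_0 = y and z_{n+1} = f(z_n)."""
    z = [y]
    for _ in range(steps):
        z.append(f(z[-1]))
    return z

# Assumed example: f = cos on M = [0, 1], a complete metric space.
# cos maps [0, 1] into [cos(1), 1], and |cos'(t)| = |sin(t)| <= sin(1) < 1 there.
lam = math.sin(1.0)
z = fixed_point_iterates(math.cos, 0.0, 100)
x = z[-1]  # numerical stand-in for the fixed point

# A priori estimate from the proof (let m tend to infinity):
# d(z_n, x) <= lam**n / (1 - lam) * d(z_1, z_0)
d10 = abs(z[1] - z[0])
bound_holds = all(
    abs(z[n] - x) <= lam**n / (1 - lam) * d10 + 1e-12 for n in range(50)
)
print(x, abs(math.cos(x) - x), bound_holds)
```

The small tolerance `1e-12` only absorbs floating-point rounding; starting from a different point of $[0,1]$ should lead to the same limit.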

A corollary to this important result is the following lemma, which shall be the main ingredient for the proof of the inverse function theorem:

Lemma:

Let $g:{\overline {B_{r}(0)}}\to {\overline {B_{r}(0)}}$  (${\overline {B_{r}(0)}}\subset \mathbb {R} ^{n}$  denoting the closed ball of radius $r$ ) be a function which is Lipschitz continuous with Lipschitz constant less than or equal to $1/2$  such that $g(0)=0$ . Then the function

$f:{\overline {B_{r}(0)}}\to \mathbb {R} ^{n},f(x):=g(x)+x$

is injective and $B_{r/2}(0)\subseteq f(B_{r}(0))$ .

Proof:

First, we note that for $y\in B_{r/2}(0)$  the function

$h:{\overline {B_{r}(0)}}\to \mathbb {R} ^{n},h(z):=y-g(z)$

is a strict contraction; this is due to

$\|y-g(z)-(y-g(z'))\|=\|g(z')-g(z)\|\leq {\frac {1}{2}}\|z-z'\|$ .

Furthermore, it maps ${\overline {B_{r}(0)}}$  to itself, since for $z\in {\overline {B_{r}(0)}}$

$\|y-g(z)\|\leq \|y\|+\|g(z)-g(0)\|\leq {\frac {r}{2}}+{\frac {1}{2}}\|z\|\leq r$ .

Hence, the Banach fixed-point theorem is applicable to $h$ . Now $x$  being a fixed point of $h$  is equivalent to

$f(x)=y$ ,

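and thus $B_{r/2}(0)\subseteq f(B_{r}(0))$  follows from the existence of fixed points; note that the fixed point $x$  lies in the open ball $B_{r}(0)$ , since $\|x\|=\|y-g(x)\|\leq \|y\|+{\frac {1}{2}}\|x\|$  gives $\|x\|\leq 2\|y\|<r$ . Furthermore, if $f(x)=f(x')$ , then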

${\frac {1}{2}}\|x-x'\|\geq \|g(x)-g(x')\|=\|f(x)-x-(f(x')-x')\|=\|x-x'\|$

and hence $x=x'$ . Thus injectivity.$\Box$
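The proof is constructive: for a given $y$ , iterating $h(z)=y-g(z)$  produces the preimage of $y$  under $f$ . Below is a minimal sketch with an example chosen only for illustration (it does not come from the text): $n=2$ , $r=1$  and $g(x_{1},x_{2})=({\tfrac {1}{2}}\sin x_{2},{\tfrac {1}{2}}\sin x_{1})$ , which satisfies $g(0)=0$  and is Lipschitz continuous with constant $1/2$ . The function name `solve` is hypothetical.

```python
import math

def g(x):
    # Assumed example map: g(0) = 0, and g is Lipschitz with constant 1/2 because
    # its Jacobian [[0, cos(x2)/2], [cos(x1)/2, 0]] has operator norm <= 1/2.
    return (0.5 * math.sin(x[1]), 0.5 * math.sin(x[0]))

def f(x):
    gx = g(x)
    return (gx[0] + x[0], gx[1] + x[1])

def solve(y, steps=80):
    """Iterate h(z) = y - g(z) from z = 0 to approximate the x with f(x) = y."""
    z = (0.0, 0.0)
    for _ in range(steps):
        gz = g(z)
        z = (y[0] - gz[0], y[1] - gz[1])
    return z

r = 1.0
y = (0.3, -0.2)  # ||y|| is roughly 0.36, which is below r/2
x_sol = solve(y)
residual = math.dist(f(x_sol), y)
print(x_sol, math.hypot(*x_sol), residual)
```

Since the contraction constant is $1/2$ , each step roughly halves the error, so 80 steps are far more than needed in double precision.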

## The inverse function theorem

Theorem:

Let $f:\mathbb {R} ^{n}\to \mathbb {R} ^{n}$  be a function which is continuously differentiable in a neighbourhood of $x_{0}\in \mathbb {R} ^{n}$  such that $f'(x_{0})$  is invertible. Then there exists an open set $U\subseteq \mathbb {R} ^{n}$  with $x_{0}\in U$  such that $f|_{U}:U\to f(U)$  is a bijective function with an inverse $f^{-1}:f(U)\to U$  which is differentiable at $f(x_{0})$  and satisfies

$(f^{-1})'(f(x_{0}))=(f'(x_{0}))^{-1}$ .

Proof:

We first reduce to the case $f(x_{0})=0$ , $x_{0}=0$  and $f'(x_{0})={\text{Id}}$ . Indeed, suppose for all those functions the theorem holds, and let now $h$  be an arbitrary function satisfying the requirements of the theorem (where the differentiability is given at $x_{0}$ ). We set

${\tilde {h}}(x):=h'(x_{0})^{-1}(h(x_{0}+x)-h(x_{0}))$

and obtain that ${\tilde {h}}$  is differentiable at $0$  with differential ${\text{Id}}$  and ${\tilde {h}}(0)=0$ ; the first property follows since we multiply both the function and the linear-affine approximation by $h'(x_{0})^{-1}$  and only shift the function, and the second one is seen from inserting $x=0$ . Hence, we obtain an inverse of ${\tilde {h}}$  which is differentiable at ${\tilde {h}}(0)=0$  with differential ${\text{Id}}$ , and if we now set

$h^{-1}(y):={\tilde {h}}^{-1}(h'(x_{0})^{-1}(y-h(x_{0})))+x_{0}$ ,

it can be seen that $h^{-1}$  is an inverse of $h$  with all the required properties (which is a bit of a tedious exercise, but involves nothing more than the definitions).

Thus let $f$  be a function such that $f(0)=0$ , $f$  is continuously differentiable in a neighbourhood of $0$  and $f'(0)={\text{Id}}$ . We define

$g(x):=f(x)-x$ .

The differential of this function at $0$  is zero (since taking the differential is linear and the differential of the function $x\mapsto x$  is the identity). Since the function $g$  is also continuously differentiable in a small neighbourhood of $0$ , we find $\delta >0$  such that

$\left|{\frac {\partial g_{k}}{\partial x_{j}}}(x)\right|<{\frac {1}{2n^{2}}}$

for all $j,k\in \{1,\ldots ,n\}$  and $x\in B_{\delta }(0)$ . Since further $g(0)=f(0)-0=0$ , the general mean-value theorem and Cauchy's inequality imply that for $k\in \{1,\ldots ,n\}$  and $x\in B_{\delta }(0)$ ,

$|g_{k}(x)|=|\langle x,\nabla g_{k}(t_{k}x)\rangle |\leq \|x\|\|\nabla g_{k}(t_{k}x)\|\leq \|x\|n{\frac {1}{2n^{2}}}$

for suitable $t_{k}\in [0,1]$ . Hence,

$\|g(x)\|\leq |g_{1}(x)|+\cdots +|g_{n}(x)|\leq {\frac {1}{2}}\|x\|$  (triangle inequality),

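and the same argument, applied along the line segment between two points $x,x'\in B_{\delta }(0)$ , yields $\|g(x)-g(x')\|\leq {\frac {1}{2}}\|x-x'\|$ . Thus our preparatory lemma is applicable (after shrinking $\delta$  slightly, so that the estimates hold on the closed ball ${\overline {B_{\delta }(0)}}$ ): $f$  is injective on ${\overline {B_{\delta }(0)}}$ , and the image $f(B_{\delta }(0))$  contains the open ball $B_{\delta /2}(0)$ . Hence we may pick $U:=B_{\delta }(0)\cap f^{-1}(B_{\delta /2}(0))$ , which is open due to the continuity of $f$ , and $f|_{U}:U\to B_{\delta /2}(0)$  is a bijection.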

Thus, the most important part of the theorem is already done. All that is left to do is to prove differentiability of $f^{-1}$  at $0$ . Now we even prove the slightly stronger claim that the differential of $f^{-1}$  at $0$  is given by the identity, although this would also follow from the chain rule once differentiability is proven.

Note now that the estimate $\|g(x)\|\leq {\frac {1}{2}}\|x\|$  implies the following bounds on $f$ :

${\frac {1}{2}}\|x\|\leq \|f(x)\|\leq {\frac {3}{2}}\|x\|$ .

The second bound follows from

$\|f(x)\|\leq \|f(x)-x\|+\|x\|=\|g(x)\|+\|x\|\leq {\frac {3}{2}}\|x\|$ ,

and the first bound follows from

$\|f(x)\|\geq |\|f(x)-x\|-\|x\||=\left|\|g(x)\|-\|x\|\right|\geq {\frac {1}{2}}\|x\|$ .

Now for the differentiability at $0$ . We have, by substitution of limits (as $f$  is continuous and $f(0)=0$ ):

${\begin{aligned}\lim _{\mathbf {h} \to 0}{\frac {\|f^{-1}(\mathbf {h} )-f^{-1}(0)-\operatorname {Id} (\mathbf {h} -0)\|}{\|\mathbf {h} \|}}&=\lim _{\mathbf {h} \to 0}{\frac {\|f^{-1}(f(\mathbf {h} ))-f(\mathbf {h} )\|}{\|f(\mathbf {h} )\|}}\\&=\lim _{\mathbf {h} \to 0}{\frac {\|\mathbf {h} -f(\mathbf {h} )\|}{\|f(\mathbf {h} )\|}},\end{aligned}}$

where the last expression converges to zero due to the differentiability of $f$  at $0$  with differential the identity, and the sandwich criterion: by the bounds on $\|f(\mathbf {h} )\|$  above, the last expression lies between

${\frac {\|\mathbf {h} -f(\mathbf {h} )\|}{{\frac {3}{2}}\|\mathbf {h} \|}}$

and

${\frac {\|\mathbf {h} -f(\mathbf {h} )\|}{{\frac {1}{2}}\|\mathbf {h} \|}}$ .$\Box$
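The derivative formula can be checked numerically in one dimension. The following is a rough sketch under assumptions made only for illustration: the function is $f(x)=x^{3}+x$ , the point is $x_{0}=1$ , the local inverse is approximated by bisection on $[0,3]$ , and its derivative by a central difference quotient. None of these choices come from the proof above, which constructs the inverse via a fixed-point argument instead.

```python
def f(x):
    # Assumed example: strictly increasing, with f'(x) = 3*x**2 + 1 > 0.
    return x**3 + x

def f_prime(x):
    return 3 * x**2 + 1

def f_inverse(y, lo=0.0, hi=3.0):
    """Bisection for the unique x in [lo, hi] with f(x) = y.

    Requires f(lo) <= y <= f(hi).
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x0 = 1.0
y0 = f(x0)
eps = 1e-4
# Central difference quotient approximating the derivative of the inverse at y0
numeric = (f_inverse(y0 + eps) - f_inverse(y0 - eps)) / (2 * eps)
# Value predicted by the theorem: the inverse of f'(x0)
predicted = 1.0 / f_prime(x0)
print(numeric, predicted)
```

The two printed numbers should agree up to the discretisation error of the difference quotient.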

## The implicit function theorem

Theorem:

Let $f:\mathbb {R} ^{n}\to \mathbb {R}$  be a continuously differentiable function, and consider the set

$S:=\{(x_{1},\ldots ,x_{n})\in \mathbb {R} ^{n}|f(x_{1},\ldots ,x_{n})=0\}$ .

If we are given some $y\in S$  such that $\partial _{n}f(y)\neq 0$ , then we find $U\subseteq \mathbb {R} ^{n-1}$  open with $(y_{1},\ldots ,y_{n-1})\in U$  and $g:U\to \mathbb {R}$  such that

$y_{n}=g(y_{1},\ldots ,y_{n-1})$  and $\{(z_{1},\ldots ,z_{n-1},g(z_{1},\ldots ,z_{n-1}))|(z_{1},\ldots ,z_{n-1})\in U\}\subseteq S$ ,

where $\{(z_{1},\ldots ,z_{n-1},g(z_{1},\ldots ,z_{n-1}))|(z_{1},\ldots ,z_{n-1})\in U\}$  is open with respect to the subspace topology of $S$ .

Furthermore, $g$  is a differentiable function.

Proof:

We define a new function

$F:\mathbb {R} ^{n}\to \mathbb {R} ^{n},F(x_{1},\ldots ,x_{n}):=(x_{1},\ldots ,x_{n-1},f(x_{1},\ldots ,x_{n}))$ .

The differential of this function looks like this:

$F'(x)={\begin{pmatrix}1&0&\cdots &&0\\0&1&&&\vdots \\\vdots &&\ddots &&\\0&\cdots &0&1&0\\\partial _{1}f(x)&&\cdots &&\partial _{n}f(x)\end{pmatrix}}$

Since we assumed that $\partial _{n}f(y)\neq 0$ , $F'(y)$  is invertible (the matrix is lower triangular, so its determinant is $\partial _{n}f(y)$ ), and hence the inverse function theorem implies the existence of a small open neighbourhood ${\tilde {V}}\subseteq \mathbb {R} ^{n}$  containing $y$  such that restricted to that neighbourhood $F$  is itself invertible, with a differentiable inverse $F^{-1}$ , which is itself defined on an open set ${\tilde {U}}$  containing $F(y)$ . Now set first

$U:=\{(x_{1},\ldots ,x_{n-1})|(x_{1},\ldots ,x_{n-1},0)\in {\tilde {U}}\}$ ,

which is open in $\mathbb {R} ^{n-1}$  (being the preimage of the open set ${\tilde {U}}$  under the continuous map $(x_{1},\ldots ,x_{n-1})\mapsto (x_{1},\ldots ,x_{n-1},0)$ ), and then

$g:U\to \mathbb {R} ,g(x_{1},\ldots ,x_{n-1}):=\pi _{n}(F^{-1}(x_{1},\ldots ,x_{n-1},0))$ ,

the $n$ -th component of $F^{-1}(x_{1},\ldots ,x_{n-1},0)$ . We claim that $g$  has the desired properties.

Indeed, we first note that $F^{-1}(x_{1},\ldots ,x_{n-1},0)=(x_{1},\ldots ,x_{n-1},g(x_{1},\ldots ,x_{n-1}))$ , since applying $F$  leaves the first $n-1$  components unchanged, and thus we get the identity by observing $F(F^{-1}(x))=x$ . Let thus $(z_{1},\ldots ,z_{n-1})\in U$ . Then

${\begin{aligned}f(z_{1},\ldots ,z_{n-1},g(z_{1},\ldots ,z_{n-1}))&=(\pi _{n}\circ F)(F^{-1}(z_{1},\ldots ,z_{n-1},0))\\&=\pi _{n}((F\circ F^{-1})(z_{1},\ldots ,z_{n-1},0))=0\end{aligned}}$ .

Furthermore, the set

$\{(z_{1},\ldots ,z_{n-1},g(z_{1},\ldots ,z_{n-1}))|(z_{1},\ldots ,z_{n-1})\in U\}$

is open with respect to the subspace topology on $S$ . Indeed, we show

$\{(z_{1},\ldots ,z_{n-1},g(z_{1},\ldots ,z_{n-1}))|(z_{1},\ldots ,z_{n-1})\in U\}=S\cap {\tilde {V}}$ .

For $\subseteq$ , we first note that the set on the left hand side is in $S$ , since all points in it are mapped to zero by $f$ . Further,

$F(z_{1},\ldots ,z_{n-1},g(z_{1},\ldots ,z_{n-1}))=(z_{1},\ldots ,z_{n-1},0)\in {\tilde {U}}$

and hence $\subseteq$  is completed when applying $F^{-1}$ . For the other direction, let a point $(x_{1},\ldots ,x_{n})$  in $S\cap {\tilde {V}}$  be given, apply $F$  to get

$F((x_{1},\ldots ,x_{n}))=(x_{1},\ldots ,x_{n-1},0)\in {\tilde {U}}$

and hence $(x_{1},\ldots ,x_{n-1})\in U$ ; further

$(x_{1},\ldots ,x_{n-1},g(x_{1},\ldots ,x_{n-1}))=(x_{1},\ldots ,x_{n})$

since both sides lie in ${\tilde {V}}$  and have the same image under $F$ , which is injective on ${\tilde {V}}$ .

Now $g$  is automatically differentiable as the component of a differentiable function.$\Box$

Informally, the above theorem states that given a set $\{x\in \mathbb {R} ^{n}|f(x)=0\}$ , one can choose the first $n-1$  coordinates as a "base" for a function, whose graph is precisely a local bit of that set.
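As an illustration, consider the unit circle in $\mathbb {R} ^{2}$ . The sketch below is an example chosen here, not taken from the proof: $f(x,y)=x^{2}+y^{2}-1$  and the point $(0.6,0.8)\in S$ , where $\partial _{2}f=1.6\neq 0$ . The function $g$  is approximated by Newton's method in the last variable (an assumed numerical device; the proof obtains $g$  from $F^{-1}$  instead). The slope is compared with $-\partial _{1}f/\partial _{2}f$ , the value one gets by differentiating $f(x,g(x))=0$  with the chain rule; this formula is not stated in the theorem above.

```python
import math

def f(x, y):
    # Assumed example: the unit circle S = {(x, y) : x**2 + y**2 - 1 = 0}.
    return x * x + y * y - 1.0

def df_dx(x, y):
    return 2.0 * x

def df_dy(x, y):
    return 2.0 * y

def g(x, y_start=0.8, steps=50):
    """Newton's method in the last variable: solve f(x, y) = 0 for y near y_start."""
    y = y_start
    for _ in range(steps):
        y -= f(x, y) / df_dy(x, y)
    return y

p = (0.6, 0.8)  # point of S with df_dy(p) != 0
xs = [0.5, 0.55, 0.6, 0.65, 0.7]
# The graph of g stays inside S: f(x, g(x)) should vanish up to rounding
max_residual = max(abs(f(x, g(x))) for x in xs)

eps = 1e-5
slope = (g(p[0] + eps) - g(p[0] - eps)) / (2 * eps)
expected_slope = -df_dx(*p) / df_dy(*p)
print(max_residual, slope, expected_slope)
```

Near $(0.6,0.8)$  the computed $g$  agrees with the explicit upper-half-circle function $x\mapsto {\sqrt {1-x^{2}}}$ , which is the "local bit" of $S$  described above.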