Calculus/Inverse function theorem, implicit function theorem


In this chapter, we want to prove the inverse function theorem (which asserts that if a function has invertible differential at a point, then it is locally invertible itself) and the implicit function theorem (which asserts that certain sets are the graphs of functions).

Banach's fixed point theorem

Theorem:

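Let $(M, d)$ be a complete metric space, and let $f: M \to M$ be a strict contraction; that is, there exists a constant $0 \le \lambda < 1$ such that

$$d(f(u), f(v)) \le \lambda \, d(u, v) \quad \text{for all } u, v \in M.$$

Then $f$ has a unique fixed point, which means that there is a unique $x \in M$ such that $f(x) = x$. Furthermore, if we start with a completely arbitrary point $y \in M$, then the sequence

$$y, \; f(y), \; f(f(y)), \; f(f(f(y))), \; \ldots$$

converges to $x$.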

Proof:

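First, we prove uniqueness of the fixed point. Assume $x, x' \in M$ are both fixed points. Then

$$d(x, x') = d(f(x), f(x')) \le \lambda \, d(x, x'), \quad \text{that is,} \quad (1 - \lambda) \, d(x, x') \le 0.$$

Since $0 \le \lambda < 1$, this implies $d(x, x') = 0$ and hence $x = x'$.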

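Now we prove existence and simultaneously the claim about the convergence of the sequence $y, f(y), f(f(y)), \ldots$. For notation, we set $z_0 := y$ and, if $z_n$ is already defined, $z_{n+1} := f(z_n)$. Then the sequence $(z_n)_{n \ge 0}$ is nothing else but the sequence $y, f(y), f(f(y)), \ldots$.

Let $n \ge 0$. We claim that

$$d(z_{n+1}, z_n) \le \lambda^n \, d(z_1, z_0).$$

Indeed, this follows by induction on $n$. The case $n = 0$ is trivial, and if the claim is true for $n$, then $d(z_{n+2}, z_{n+1}) = d(f(z_{n+1}), f(z_n)) \le \lambda \, d(z_{n+1}, z_n) \le \lambda^{n+1} \, d(z_1, z_0)$.

Hence, by the triangle inequality, for all $m \ge 1$,

$$d(z_{n+m}, z_n) \le \sum_{j=n}^{n+m-1} d(z_{j+1}, z_j) \le d(z_1, z_0) \sum_{j=n}^{n+m-1} \lambda^j \le \lambda^n \, d(z_1, z_0) \sum_{j=0}^{\infty} \lambda^j = \frac{\lambda^n}{1 - \lambda} \, d(z_1, z_0).$$

The latter expression does not depend on $m$ and goes to zero as $n \to \infty$, and hence we are dealing with a Cauchy sequence. As we are in a complete metric space, it converges to a limit $x \in M$. This limit is moreover a fixed point, as the continuity of $f$ ($f$ is Lipschitz continuous with constant $\lambda$) implies

$$x = \lim_{n \to \infty} z_{n+1} = \lim_{n \to \infty} f(z_n) = f\left(\lim_{n \to \infty} z_n\right) = f(x).$$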
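The iteration from the theorem can be tried out numerically. The following is a minimal Python sketch and not part of the proof: it assumes the example map $\cos$ on the interval $[0, 1]$ (which $\cos$ maps into itself, and on which $|\cos'(x)| = |\sin x| \le \sin 1 < 1$), and the helper name `fixed_point` as well as the tolerance are our own illustrative choices.

```python
import math


def fixed_point(f, y, tol=1e-12, max_iter=1000):
    """Iterate z_{n+1} = f(z_n) from z_0 = y until two successive
    iterates differ by at most tol; return the last iterate and the
    number of applications of f."""
    z = y
    for n in range(max_iter):
        z_next = f(z)
        if abs(z_next - z) <= tol:
            return z_next, n + 1
        z = z_next
    raise RuntimeError("no convergence within max_iter iterations")


# Assumption: cos is a strict contraction of the complete metric
# space [0, 1] with constant lambda = sin(1) < 1.
x_a, steps_a = fixed_point(math.cos, 0.0)
x_b, steps_b = fixed_point(math.cos, 1.0)
print(x_a, x_b)
```

Both starting points lead to the same limit, roughly 0.739085, in line with the uniqueness part of the theorem.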

A corollary to this important result is the following lemma, which shall be the main ingredient for the proof of the inverse function theorem:

Lemma:

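Let $g: \overline{B_r(0)} \to \mathbb{R}^n$ ($\overline{B_r(0)}$ denoting the closed ball of radius $r > 0$ around the origin) be a function which is Lipschitz continuous with Lipschitz constant less than or equal to $\frac{1}{2}$ such that $g(0) = 0$. Then the function

$$f: \overline{B_r(0)} \to \mathbb{R}^n, \quad f(x) := x + g(x)$$

is injective and $\overline{B_{r/2}(0)} \subseteq f(\overline{B_r(0)})$.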

Proof:

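First, we note that for $y \in \overline{B_{r/2}(0)}$ the function

$$h: \overline{B_r(0)} \to \mathbb{R}^n, \quad h(x) := y - g(x)$$

is a strict contraction; this is due to

$$\|h(x) - h(z)\| = \|g(z) - g(x)\| \le \frac{1}{2} \|x - z\|.$$

Furthermore, it maps $\overline{B_r(0)}$ to itself, since for $x \in \overline{B_r(0)}$

$$\|h(x)\| \le \|y\| + \|g(x) - g(0)\| \le \frac{r}{2} + \frac{1}{2} \|x\| \le r.$$

Hence, the Banach fixed-point theorem is applicable to $h$ (the closed ball $\overline{B_r(0)}$ being a complete metric space). Now $x$ being a fixed point of $h$ is equivalent to

$$y = x + g(x) = f(x),$$

and thus $\overline{B_{r/2}(0)} \subseteq f(\overline{B_r(0)})$ follows from the existence of fixed points. Furthermore, if $f(x) = f(z)$, then

$$\|x - z\| = \|g(z) - g(x)\| \le \frac{1}{2} \|x - z\|$$

and hence $x = z$. This proves injectivity.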
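The proof of the lemma is constructive: a preimage of a point under $x \mapsto x + g(x)$ is obtained as the fixed point of $x \mapsto y - g(x)$. The following Python sketch illustrates this in the plane. The concrete function `g`, the radius `r`, the point `y` and the helper name `solve` are hypothetical choices made for illustration only; `g` is chosen so that it vanishes at the origin and has Lipschitz constant at most $\frac{1}{2}$.

```python
import math


def g(x):
    # Hypothetical example: g(0) = 0 and Lipschitz constant at most 1/2,
    # because sin is Lipschitz continuous with constant 1.
    return (0.5 * math.sin(x[1]), 0.5 * math.sin(x[0]))


def f(x):
    # The map f(x) = x + g(x) from the lemma.
    gx = g(x)
    return (x[0] + gx[0], x[1] + gx[1])


def solve(y, iterations=60):
    """Approximate the x with f(x) = y by iterating h(x) = y - g(x),
    starting from the origin."""
    x = (0.0, 0.0)
    for _ in range(iterations):
        gx = g(x)
        x = (y[0] - gx[0], y[1] - gx[1])
    return x


r = 1.0
y = (0.3, -0.2)   # its norm is about 0.36, which is at most r / 2
x = solve(y)
print(x, f(x))
```

Since each step halves the distance to the fixed point at least, 60 iterations are far more than double precision requires; the number is simply a generous default.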

The inverse function theorem

Theorem:

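Let $f: \mathbb{R}^n \to \mathbb{R}^n$ be a function which is continuously differentiable in a neighbourhood of a point $x_0 \in \mathbb{R}^n$ such that the differential $f'(x_0)$ is invertible. Then there exists an open set $U \subseteq \mathbb{R}^n$ with $x_0 \in U$ such that $f|_U$ is a bijective function onto its image with an inverse $f^{-1}: f(U) \to U$ which is differentiable at $f(x_0)$ and satisfies

$$(f^{-1})'(f(x_0)) = (f'(x_0))^{-1}.$$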

Proof:

We first reduce to the case $x_0 = 0$, $f(x_0) = 0$ and $f'(x_0) = \operatorname{Id}$. Indeed, suppose the theorem holds for all such functions, and let now $h$ be an arbitrary function satisfying the requirements of the theorem (where the differentiability is given at $x_0$). We set

$$\tilde h(x) := h'(x_0)^{-1} \bigl( h(x_0 + x) - h(x_0) \bigr)$$

and obtain that $\tilde h$ is differentiable at $0$ with differential $\operatorname{Id}$ and $\tilde h(0) = 0$; the first property follows since we multiply both the function and the linear-affine approximation by $h'(x_0)^{-1}$ and only shift the function, and the second one is seen from inserting $x = 0$. Hence, we obtain an inverse $\tilde h^{-1}$ of $\tilde h$ together with its differential at $0$, and if we now set

$$h^{-1}(y) := x_0 + \tilde h^{-1} \bigl( h'(x_0)^{-1} (y - h(x_0)) \bigr),$$

it can be seen that $h^{-1}$ is an inverse of $h$ with all the required properties (which is a bit of a tedious exercise, but involves nothing more than the definitions).

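Thus let $f$ be a function which is continuously differentiable in a neighbourhood of $0$ and satisfies $f(0) = 0$ and $f'(0) = \operatorname{Id}$ (in particular, $f'(0)$ is invertible). We define

$$g(x) := f(x) - x.$$

The differential of this function at $0$ is zero (since taking the differential is linear and the differential of the function $x \mapsto x$ is the identity). Since the function $g$ is also continuously differentiable in a small neighbourhood of $0$, we find $r > 0$ such that

$$\left| \frac{\partial g_j}{\partial x_k}(x) \right| \le \frac{1}{2n^2}$$

for all $j, k \in \{1, \ldots, n\}$ and $x \in \overline{B_r(0)}$. Since further $\overline{B_r(0)}$ is convex, the mean-value theorem and the Cauchy-Schwarz inequality imply that for $j \in \{1, \ldots, n\}$ and $x, z \in \overline{B_r(0)}$,

$$|g_j(x) - g_j(z)| = |\langle \nabla g_j(z + t_j (x - z)), x - z \rangle| \le \|\nabla g_j(z + t_j (x - z))\| \, \|x - z\| \le \frac{1}{2n} \|x - z\|$$

for suitable $t_j \in [0, 1]$. Hence,

$$\|g(x) - g(z)\| \le \sum_{j=1}^n |g_j(x) - g_j(z)| \le \frac{1}{2} \|x - z\|$$ (triangle inequality),

and thus, since also $g(0) = 0$, we obtain that our preparatory lemma is applicable: $f$ is injective on $\overline{B_r(0)}$, and its image contains the open ball $B_{r/2}(0)$. Thus we may pick $U := f^{-1}(B_{r/2}(0))$, which is open due to the continuity of $f$ (note that $U \subseteq B_r(0)$, since $\|f(x)\| \ge \|x\| - \|g(x)\| \ge \frac{1}{2} \|x\|$); then $0 \in U$ and $f|_U: U \to B_{r/2}(0)$ is a bijection.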

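Thus, the most important part of the theorem is already done. All that is left to do is to prove differentiability of $f^{-1}$ at $0$. We even prove the slightly stronger claim that the differential of $f^{-1}$ at $0$ is given by the identity, although this would also follow from the chain rule once differentiability is proven.

Note now that the contraction estimate $\|g(x)\| = \|g(x) - g(0)\| \le \frac{1}{2} \|x\|$ for $g(x) = f(x) - x$ implies the following bounds on $f$:

$$\frac{1}{2} \|x\| \le \|f(x)\| \le \frac{3}{2} \|x\|.$$

The second bound follows from

$$\|f(x)\| = \|x + g(x)\| \le \|x\| + \|g(x)\| \le \frac{3}{2} \|x\|,$$

and the first bound follows from

$$\|f(x)\| = \|x + g(x)\| \ge \|x\| - \|g(x)\| \ge \frac{1}{2} \|x\|.$$

Now for the differentiability at $0$. We have, by substitution of limits with $h = f(x)$ (as $f^{-1}(0) = 0$ and $f^{-1}$ is continuous at $0$ by the first bound):

$$\lim_{h \to 0} \frac{\|f^{-1}(h) - f^{-1}(0) - h\|}{\|h\|} = \lim_{x \to 0} \frac{\|x - f(x)\|}{\|f(x)\|},$$

where the last expression converges to zero due to the differentiability of $f$ at $0$ with differential the identity, and the sandwich criterion applied to the expressions

$$\frac{\|x - f(x)\|}{\frac{3}{2} \|x\|}$$

and

$$\frac{\|x - f(x)\|}{\frac{1}{2} \|x\|}.$$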
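The statement about the differential of the inverse can be checked numerically in one dimension. The sketch below is only an illustration under our own assumptions: the example function `f`, the base point `x0` and the helper name `local_inverse` are hypothetical. The local inverse is computed with the iteration $x \mapsto x - f'(x_0)^{-1}(f(x) - y)$, which is what the fixed-point iteration from the preparatory lemma becomes when the normalisation at the start of the proof is undone.

```python
def f(x):
    # Hypothetical example function, continuously differentiable.
    return x**3 + x


def f_prime(x):
    return 3 * x**2 + 1


x0 = 1.0
a = f_prime(x0)   # differential at x0; here 4.0, hence invertible


def local_inverse(y, iterations=50):
    """Approximate the x near x0 with f(x) = y by iterating
    x -> x - (f(x) - y) / f'(x0), starting from x0."""
    x = x0
    for _ in range(iterations):
        x = x - (f(x) - y) / a
    return x


y0 = f(x0)
step = 1e-5
# Central difference quotient of the local inverse at y0 = f(x0).
slope = (local_inverse(y0 + step) - local_inverse(y0 - step)) / (2 * step)
print(slope, 1 / a)
```

The two printed numbers should agree up to the discretisation error of the difference quotient, which is what the formula for the differential of the inverse predicts.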

The implicit function theorem

Theorem:

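Let $f: \mathbb{R}^n \to \mathbb{R}$ be a continuously differentiable function, and consider the set

$$S := \{x \in \mathbb{R}^n : f(x) = 0\}.$$

If we are given some $y \in S$ such that $\partial_n f(y) \neq 0$, then we find $U \subseteq \mathbb{R}^{n-1}$ open with $(y_1, \ldots, y_{n-1}) \in U$ and $g: U \to \mathbb{R}$ such that

$$\{(z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) : (z_1, \ldots, z_{n-1}) \in U\} \subseteq S$$ and $$g(y_1, \ldots, y_{n-1}) = y_n,$$

where the set $\{(z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) : (z_1, \ldots, z_{n-1}) \in U\}$, that is, the graph of $g$, is open with respect to the subspace topology of $S$.

Furthermore, $g$ is a differentiable function.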

Proof:

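We define a new function

$$F: \mathbb{R}^n \to \mathbb{R}^n, \quad F(x_1, \ldots, x_n) := (x_1, \ldots, x_{n-1}, f(x_1, \ldots, x_n)).$$

The differential of this function looks like this:

$$F'(x) = \begin{pmatrix} 1 & & & 0 \\ & \ddots & & \vdots \\ & & 1 & 0 \\ \partial_1 f(x) & \cdots & \partial_{n-1} f(x) & \partial_n f(x) \end{pmatrix}$$

Since we assumed that $\partial_n f(y) \neq 0$, the matrix $F'(y)$ has nonzero determinant $\partial_n f(y)$ and is therefore invertible, and hence the inverse function theorem implies the existence of a small open neighbourhood $V \subseteq \mathbb{R}^n$ containing $y$ such that restricted to that neighbourhood $F$ is itself invertible, with a differentiable inverse $F^{-1}$, which is itself defined on an open set $W = F(V)$ containing $F(y) = (y_1, \ldots, y_{n-1}, 0)$. Now set first

$$U := \{(x_1, \ldots, x_{n-1}) \in \mathbb{R}^{n-1} : (x_1, \ldots, x_{n-1}, 0) \in W\},$$

which is open in $\mathbb{R}^{n-1}$ (it corresponds to the intersection of $W$ with $\mathbb{R}^{n-1} \times \{0\}$, which is open with respect to the subspace topology of the latter), and then

$$g: U \to \mathbb{R}, \quad g(x_1, \ldots, x_{n-1}) := \pi_n\bigl(F^{-1}(x_1, \ldots, x_{n-1}, 0)\bigr),$$

the $n$-th component of $F^{-1}(x_1, \ldots, x_{n-1}, 0)$. We claim that $g$ has the desired properties.

In what follows, we abbreviate $x' := (x_1, \ldots, x_{n-1})$. We first note that $F^{-1}(x', 0) = (x', g(x'))$, since applying $F$ leaves the first $n-1$ components unchanged, and thus we get the identity by observing $F(F^{-1}(x', 0)) = (x', 0)$. Let thus $x' \in U$. Then

$$f(x', g(x')) = \pi_n\bigl(F(x', g(x'))\bigr) = \pi_n\bigl(F(F^{-1}(x', 0))\bigr) = 0.$$

Furthermore, the set

$$\{(x', g(x')) : x' \in U\}$$

is open with respect to the subspace topology on $S$. Indeed, we show

$$\{(x', g(x')) : x' \in U\} = S \cap V.$$

For $\subseteq$, we first note that the set on the left hand side is in $S$, since all points in it are mapped to zero by $f$. Further, for $x' \in U$,

$$F(x', g(x')) = (x', 0) \in W,$$

and hence $\subseteq$ is completed when applying $F^{-1}$, which maps $W$ into $V$. For the other direction, let a point $x$ in $S \cap V$ be given, apply $F$ to get

$$F(x) = (x', f(x)) = (x', 0) \in W$$

and hence $x' \in U$; further

$$x = F^{-1}(x', 0) = (x', g(x'))$$

by applying $F^{-1}$ to both sides of the equation. In particular, the point $y \in S \cap V$ lies on the graph of $g$, that is, $g(y_1, \ldots, y_{n-1}) = y_n$.

Now $g$ is automatically differentiable as the component of a differentiable function.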


Informally, the above theorem states that given a set $S = \{x \in \mathbb{R}^n : f(x) = 0\}$ and a point of $S$ at which $\partial_n f \neq 0$, one can choose the first $n-1$ coordinates as a "base" for a function whose graph is precisely a local bit of that set.
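As an illustration, consider the unit circle in the plane, given as the zero set of $f(x_1, x_2) = x_1^2 + x_2^2 - 1$, near the point $(0.6, 0.8)$, where the partial derivative with respect to the last coordinate equals $1.6 \neq 0$. The Python sketch below is a hypothetical example of ours, not part of the proof: it approximates the implicitly defined function numerically by a simple fixed-point iteration in the last coordinate and compares it with the explicit formula $\sqrt{1 - x_1^2}$ for the upper half of the circle. The names `g` and `d2f_at_y` are illustrative.

```python
import math


def f(x1, x2):
    # The zero set S of f is the unit circle.
    return x1**2 + x2**2 - 1.0


y = (0.6, 0.8)            # a point of S
d2f_at_y = 2.0 * y[1]     # partial derivative in x2 at y, here 1.6


def g(x1, iterations=60):
    """Approximate the t near y[1] with f(x1, t) = 0 by iterating
    t -> t - f(x1, t) / d2f_at_y, starting from y[1]."""
    t = y[1]
    for _ in range(iterations):
        t = t - f(x1, t) / d2f_at_y
    return t


for x1 in (0.5, 0.55, 0.6, 0.65, 0.7):
    # Locally, the graph of g should coincide with the upper half circle.
    print(x1, g(x1), math.sqrt(1.0 - x1**2))
```

The iteration is only expected to converge for `x1` close to 0.6, which mirrors the local nature of the theorem: near the points $(\pm 1, 0)$, where the partial derivative in the last coordinate vanishes, the circle is not the graph of a function of $x_1$.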