# Real Analysis/Differentiation in Rn



We will first revise some important concepts of Linear Algebra that are important in Multivariate Analysis. The reader with no background in Linear Algebra is advised to refer to the book Linear Algebra.

## Vector Space

A set ${\displaystyle {\mathcal {V}}}$ is said to be a Vector Space over a field ${\displaystyle F}$ if and only if operations of addition and scalar multiplication are defined on it that satisfy, for all ${\displaystyle \mathbf {v_{1}} ,\mathbf {v_{2}} ,\mathbf {v_{3}} \in {\mathcal {V}}}$ and all ${\displaystyle c,c_{1},c_{2}\in F}$:

(i) Commutativity: ${\displaystyle \mathbf {v_{1}} +\mathbf {v_{2}} =\mathbf {v_{2}} +\mathbf {v_{1}} }$

(ii) Associativity: ${\displaystyle (\mathbf {v_{1}} +\mathbf {v_{2}} )+\mathbf {v_{3}} =\mathbf {v_{1}} +(\mathbf {v_{2}} +\mathbf {v_{3}} )}$

(iii) Identity: There exists ${\displaystyle \mathbf {0} \in {\mathcal {V}}}$ such that ${\displaystyle \mathbf {v_{1}} +\mathbf {0} =\mathbf {v_{1}} =\mathbf {0} +\mathbf {v_{1}} }$

(iv) Inverse: There exists ${\displaystyle -\mathbf {v_{1}} \in {\mathcal {V}}}$ such that ${\displaystyle \mathbf {v_{1}} +(-\mathbf {v_{1}} )=\mathbf {0} }$

(v) Distributivity over vector addition: ${\displaystyle c\mathbf {v_{1}} +c\mathbf {v_{2}} =c(\mathbf {v_{1}} +\mathbf {v_{2}} )}$

(vi) Distributivity over scalar addition: ${\displaystyle c_{1}\mathbf {v} +c_{2}\mathbf {v} =(c_{1}+c_{2})\mathbf {v} }$

(vii) Compatibility of scalar multiplication: ${\displaystyle c_{1}(c_{2}\mathbf {v_{1}} )=c_{2}(c_{1}\mathbf {v_{1}} )=(c_{1}c_{2})\mathbf {v_{1}} }$

(viii) Scalar identity: ${\displaystyle 1\mathbf {v_{1}} =\mathbf {v_{1}} }$ , where ${\displaystyle 1}$ is the multiplicative identity of ${\displaystyle F}$

Members of a vector space are called "vectors" and members of the field are called "scalars". ${\displaystyle \mathbb {R} ^{n}}$ and the set of all polynomials are examples of vector spaces.

A set of linearly independent vectors that spans the vector space is said to be a Basis for the vector space.
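As an illustration (not part of the formal development), the axioms can be checked numerically for ${\displaystyle \mathbb {R} ^{3}}$ with component-wise operations. This is only a sketch with sample vectors, not a proof:

```python
# A minimal numerical check of the vector-space axioms for R^3,
# with tuples standing in for vectors (an illustration, not a proof).
def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    return tuple(c * a for a in v)

v1, v2, v3 = (1.0, 2.0, 3.0), (-4.0, 0.5, 2.0), (0.0, 7.0, -1.0)
zero = (0.0, 0.0, 0.0)
c1, c2 = 2.0, -3.0

assert add(v1, v2) == add(v2, v1)                                   # (i) commutativity
assert add(add(v1, v2), v3) == add(v1, add(v2, v3))                 # (ii) associativity
assert add(v1, zero) == v1                                          # (iii) identity
assert add(v1, scale(-1.0, v1)) == zero                             # (iv) inverse
assert add(scale(c1, v1), scale(c1, v2)) == scale(c1, add(v1, v2))  # (v)
assert add(scale(c1, v1), scale(c2, v1)) == scale(c1 + c2, v1)      # (vi)
assert scale(c1, scale(c2, v1)) == scale(c1 * c2, v1)               # (vii)
```

Note that such a check only verifies the axioms for the particular sample vectors chosen; the axioms themselves are statements about all vectors and scalars.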

## Linear Transformations

Let ${\displaystyle X,Y}$  be vector spaces.

Let ${\displaystyle T:X\to Y}$

We say that ${\displaystyle T}$ is a Linear Transformation if and only if for all ${\displaystyle \mathbf {v_{1}} ,\mathbf {v_{2}} \in X}$ and all scalars ${\displaystyle c}$ ,

(i)${\displaystyle T(\mathbf {v_{1}} +\mathbf {v_{2}} )=T(\mathbf {v_{1}} )+T(\mathbf {v_{2}} )}$

(ii)${\displaystyle T(c\mathbf {v_{1}} )=cT(\mathbf {v_{1}} )}$
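The prototypical example of a linear transformation is multiplication by a fixed matrix. A small numerical sketch, using an arbitrarily chosen matrix ${\displaystyle {\begin{bmatrix}2&1\\0&3\end{bmatrix}}}$ (an assumption for illustration), checks both conditions:

```python
def T(v):
    # A fixed linear map R^2 -> R^2: multiplication by the matrix [[2, 1], [0, 3]].
    x, y = v
    return (2 * x + y, 3 * y)

v1, v2, c = (1.0, -2.0), (0.5, 4.0), 3.0

# (i) Additivity: T(v1 + v2) = T(v1) + T(v2)
lhs = T((v1[0] + v2[0], v1[1] + v2[1]))
rhs = tuple(a + b for a, b in zip(T(v1), T(v2)))
assert lhs == rhs

# (ii) Homogeneity: T(c v1) = c T(v1)
assert T((c * v1[0], c * v1[1])) == tuple(c * a for a in T(v1))
```

Conversely, every linear transformation between finite-dimensional vector spaces is represented by a matrix once bases are chosen, which is used below when the Jacobian matrix is introduced.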

As we will see, there are two major ways to define a 'derivative' of a multivariable function. We first present the seemingly more straightforward way of using "Partial Derivatives".

## Directional and Partial Derivatives

Let ${\displaystyle \mathbf {f} :\mathbb {R} ^{n}\to \mathbb {R} ^{m}}$

Let ${\displaystyle \mathbf {a} ,\mathbf {y} \in \mathbb {R} ^{n}}$

We say that ${\displaystyle \mathbf {f} }$  is differentiable at ${\displaystyle \mathbf {a} \in \mathbb {R} ^{n}}$  with respect to vector ${\displaystyle \mathbf {y} }$  if and only if there exists ${\displaystyle \mathbf {L} \in \mathbb {R} ^{m}}$  that satisfies

${\displaystyle \lim _{h\to 0}{\frac {\mathbf {f} (\mathbf {a} +h\mathbf {y} )-\mathbf {f} (\mathbf {a} )}{h}}=\mathbf {L} }$

${\displaystyle \mathbf {L} }$  is said to be the derivative of ${\displaystyle \mathbf {f} }$  at ${\displaystyle \mathbf {a} }$  with respect to ${\displaystyle \mathbf {y} }$  and is written as ${\displaystyle \mathbf {f} '(\mathbf {a} ;\mathbf {y} )}$
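Since this is the limit of a single-variable difference quotient, it can be approximated numerically. A minimal sketch, using a symmetric difference quotient and a sample function ${\displaystyle f(x,y)=x^{2}+3xy}$ (an assumption for illustration):

```python
def directional_derivative(f, a, y, h=1e-6):
    # Symmetric difference quotient approximating
    # f'(a; y) = lim_{h -> 0} (f(a + h y) - f(a)) / h
    ax, ay = a
    yx, yy = y
    return (f(ax + h * yx, ay + h * yy) - f(ax - h * yx, ay - h * yy)) / (2 * h)

f = lambda x, y: x**2 + 3 * x * y   # sample function (an assumption)

# At a = (1, 2) with y = (1, 0), the derivative with respect to y is
# the x-partial 2x + 3y, which equals 8 at (1, 2).
approx = directional_derivative(f, (1.0, 2.0), (1.0, 0.0))
assert abs(approx - 8.0) < 1e-4
```

The step size `h` trades truncation error against floating-point round-off; values near `1e-6` are a common compromise for double precision.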

When ${\displaystyle \mathbf {y} }$ is a unit vector, ${\displaystyle \mathbf {f} '(\mathbf {a} ;\mathbf {y} )}$ is called the directional derivative of ${\displaystyle \mathbf {f} }$ at ${\displaystyle \mathbf {a} }$ in the direction ${\displaystyle \mathbf {y} }$ ; when ${\displaystyle \mathbf {y} }$ is one of the standard basis vectors ${\displaystyle \mathbf {e_{i}} }$ , it is called a partial derivative. Here we will explicitly define partial derivatives and see some of their properties.

Let ${\displaystyle f}$  be a real multivariate function defined on an open subset ${\displaystyle \Omega }$  of ${\displaystyle \mathbb {R} ^{n}}$

${\displaystyle f:\Omega \longrightarrow \mathbb {R} }$ .

Then the partial derivative of ${\displaystyle f}$ at a point ${\displaystyle (x_{1},\ldots ,x_{n})}$ with respect to the coordinate ${\displaystyle x_{i}}$ is defined as the following limit

${\displaystyle \lim _{h\rightarrow 0}{f(x_{1},\ldots ,x_{i}+h,\ldots ,x_{n})-f(x_{1},\ldots ,x_{i},\ldots ,x_{n}) \over h}={\partial f \over \partial x_{i}}}$ .

${\displaystyle f}$ is said to be differentiable at the point ${\displaystyle \mathbf {x} =(x_{1},\ldots ,x_{n})}$ if the difference ${\displaystyle f(\mathbf {x} +\mathbf {h} )-f(\mathbf {x} )}$ , for an increment ${\displaystyle \mathbf {h} \in \mathbb {R} ^{n}}$ , is equivalent up to first order in ${\displaystyle \mathbf {h} }$ to a linear form ${\displaystyle L}$ (of ${\displaystyle \mathbf {h} }$ ), that is

${\displaystyle f(\mathbf {x} +\mathbf {h} )-f(\mathbf {x} )=L(\mathbf {h} )+o(\|\mathbf {h} \|).}$

The linear form ${\displaystyle L}$ is then said to be the differential of ${\displaystyle f}$ at ${\displaystyle (x_{1},\ldots ,x_{n})}$ , and is written as ${\displaystyle Df|_{(x_{1},\ldots ,x_{n})}}$ or sometimes ${\displaystyle \mathrm {d} f(x_{1},\ldots ,x_{n})}$ .

In this case, where ${\displaystyle f}$  is differentiable at ${\displaystyle (x_{1},\ldots ,x_{n})}$ , by linearity we can write

${\displaystyle \mathrm {d} f={\partial f \over \partial x_{1}}\mathrm {d} x_{1}+\ldots +{\partial f \over \partial x_{n}}\mathrm {d} x_{n}}$
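This formula can be checked numerically: approximate each partial derivative by a difference quotient, assemble the differential, and compare it with the actual increment of the function for a small displacement. A sketch using the sample function ${\displaystyle f(x,y)=xy+y^{2}}$ (an assumption for illustration):

```python
def partial(f, x, i, h=1e-6):
    # Approximate the partial derivative of f with respect to x_i at the
    # point x, using a symmetric difference quotient.
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (f(*xp) - f(*xm)) / (2 * h)

f = lambda x, y: x * y + y**2      # sample function (an assumption)
p = (2.0, 3.0)
dx = (1e-4, -2e-4)                 # a small displacement (dx_1, dx_2)

# df = (∂f/∂x_1) dx_1 + (∂f/∂x_2) dx_2
df = sum(partial(f, p, i) * dx[i] for i in range(2))

# Compare with the actual increment f(p + dx) - f(p); they should agree
# up to terms of second order in the displacement.
actual = f(p[0] + dx[0], p[1] + dx[1]) - f(*p)
assert abs(df - actual) < 1e-6
```

The leftover discrepancy is exactly the ${\displaystyle o(\|\mathbf {h} \|)}$ term of the definition, which shrinks faster than the displacement itself.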

${\displaystyle f}$ is said to be continuously differentiable if its differential is defined at every point of its domain and varies continuously with the point ${\displaystyle (x_{1},\ldots ,x_{n})}$ , that is, if its coordinates as a linear form, the partial derivatives ${\displaystyle {\partial f \over \partial x_{i}}}$ , vary continuously.

Partial derivatives may exist even when ${\displaystyle f}$ is not differentiable, and sometimes even when ${\displaystyle f}$ is not continuous. For example, the function

${\displaystyle f:(x,y)\mapsto {xy \over x^{2}+y^{2}}}$

(with ${\displaystyle f(0,0)=0}$ ) has both partial derivatives at the origin but is not continuous there. In such a case we say that ${\displaystyle f}$ is separably differentiable.
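A numerical sketch of this phenomenon, using the classic example ${\displaystyle f(x,y)=xy/(x^{2}+y^{2})}$ with ${\displaystyle f(0,0)=0}$ : both partial derivatives at the origin exist (and vanish, since ${\displaystyle f}$ is identically zero on the coordinate axes), yet along the line ${\displaystyle y=x}$ the function is constantly ${\displaystyle 1/2}$ , so it cannot be continuous at the origin.

```python
# Classic example: f(x, y) = xy / (x^2 + y^2), with f(0, 0) = 0.
def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x**2 + y**2)

# Both partial derivatives at the origin exist and equal 0,
# because f vanishes identically on both coordinate axes.
h = 1e-8
assert (f(h, 0.0) - f(0.0, 0.0)) / h == 0.0
assert (f(0.0, h) - f(0.0, 0.0)) / h == 0.0

# Yet f is not continuous at the origin: along the diagonal y = x,
# f(t, t) = t^2 / (2 t^2) = 1/2 for every t != 0, which does not
# approach f(0, 0) = 0 as t -> 0.
for t in (1e-1, 1e-4, 1e-8):
    assert abs(f(t, t) - 0.5) < 1e-12
```

The limit of ${\displaystyle f}$ at the origin therefore depends on the direction of approach, which is precisely what differentiability (and continuity) rules out.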

## Total Derivatives

The total derivative is important as it preserves some of the key properties of the single-variable derivative, most notably the assertion that differentiability implies continuity.

Let ${\displaystyle f:A\subseteq \mathbb {R} ^{n}\to \mathbb {R} ^{m}}$

We say that ${\displaystyle f}$  is differentiable at ${\displaystyle \mathbf {a} \in A}$  if and only if there exists a linear transformation, ${\displaystyle \mathbf {D} f(\mathbf {a} ):\mathbb {R} ^{n}\to \mathbb {R} ^{m}}$ , called the derivative or total derivative of ${\displaystyle f}$  at ${\displaystyle \mathbf {a} }$ , such that

${\displaystyle \lim _{\|\mathbf {h} \|\to 0}{\frac {\|f(\mathbf {a} +\mathbf {h} )-f(\mathbf {a} )-\mathbf {D} f(\mathbf {a} )(\mathbf {h} )\|}{\|\mathbf {h} \|}}=0}$

One should read ${\displaystyle \mathbf {D} f(\mathbf {a} )(\mathbf {h} )}$  as the linear transformation ${\displaystyle \mathbf {D} f(\mathbf {a} )}$  applied to the vector ${\displaystyle \mathbf {h} }$ . Sometimes it is customary to write this as ${\displaystyle \mathbf {D} f(\mathbf {a} )\cdot (\mathbf {h} )}$ .

### Theorem

Suppose ${\displaystyle A\subseteq \mathbb {R} ^{n}}$  is an open set and ${\displaystyle f:A\to \mathbb {R} ^{m}}$  is differentiable on A. Think of writing ${\displaystyle f}$  in components so ${\displaystyle f(x_{1},\ldots ,x_{n})=(f_{1}(x_{1},\ldots ,x_{n}),\ldots ,f_{m}(x_{1},\ldots ,x_{n}))}$ . Then the partial derivatives ${\displaystyle {\frac {\partial f_{j}}{\partial x_{i}}}}$  exist, and the matrix representing the linear transformation ${\displaystyle \mathbf {D} f(\mathbf {x} )}$  with respect to the standard bases of ${\displaystyle \mathbb {R} ^{n}}$  and ${\displaystyle \mathbb {R} ^{m}}$  is given by the Jacobian Matrix:

${\displaystyle {\begin{bmatrix}{\frac {\partial f_{1}}{\partial x_{1}}}&\cdots &{\frac {\partial f_{1}}{\partial x_{n}}}\\\vdots &\ddots &\vdots \\{\frac {\partial f_{m}}{\partial x_{1}}}&\cdots &{\frac {\partial f_{m}}{\partial x_{n}}}\end{bmatrix}}.}$

evaluated at ${\displaystyle \mathbf {x} =(x_{1},\ldots ,x_{n})}$ .
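As an illustration, the Jacobian matrix can be approximated column by column with difference quotients and compared against a hand-computed answer. A sketch using the sample map ${\displaystyle f(x,y)=(xy,\,x+y^{2})}$ (an assumption for illustration), whose Jacobian is ${\displaystyle {\begin{bmatrix}y&x\\1&2y\end{bmatrix}}}$ :

```python
def jacobian(f, x, h=1e-6):
    # Approximate the m x n Jacobian matrix of f: R^n -> R^m at the point x,
    # one column per input coordinate, by symmetric difference quotients.
    n = len(x)
    m = len(f(*x))
    J = [[0.0] * n for _ in range(m)]
    for i in range(n):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        fp, fm = f(*xp), f(*xm)
        for j in range(m):
            J[j][i] = (fp[j] - fm[j]) / (2 * h)
    return J

# Sample map f(x, y) = (x*y, x + y^2) (an assumption); its Jacobian is
# [[y, x], [1, 2y]], so at (2, 3) it should be approximately [[3, 2], [1, 6]].
J = jacobian(lambda x, y: (x * y, x + y**2), (2.0, 3.0))
expected = [[3.0, 2.0], [1.0, 6.0]]
assert all(abs(J[j][i] - expected[j][i]) < 1e-4 for j in range(2) for i in range(2))
```

Each row of the matrix collects the partial derivatives of one component function ${\displaystyle f_{j}}$ , matching the theorem's description of ${\displaystyle \mathbf {D} f(\mathbf {x} )}$ in the standard bases.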

NOTE: This theorem requires the function to be differentiable to begin with. It is a common mistake to assume that if the partial derivatives exist, then the function must be differentiable because we can construct the Jacobian matrix. This, however, is false, which brings us to the next theorem:

### Theorem

Suppose ${\displaystyle A\subseteq \mathbb {R} ^{n}}$  is an open set and ${\displaystyle f:A\to \mathbb {R} ^{m}}$ . Think of writing ${\displaystyle f}$  in components so ${\displaystyle f(x_{1},\ldots ,x_{n})=(f_{1}(x_{1},\ldots ,x_{n}),\ldots ,f_{m}(x_{1},\ldots ,x_{n}))}$ . If ${\displaystyle {\frac {\partial f_{j}}{\partial x_{i}}}}$  exists and is continuous on ${\displaystyle A}$  for all ${\displaystyle j\in \{1,\ldots ,m\}}$  and for all ${\displaystyle i\in \{1,\ldots ,n\}}$ , then ${\displaystyle f}$  is differentiable on ${\displaystyle A}$ .

This theorem gives us a convenient criterion for a function to be differentiable.