Calculus/Directional derivatives and the gradient vector

Directional derivatives

Normally, a partial derivative of a function with respect to one of its variables, say x_j, is the derivative of the "slice" of that function taken parallel to the x_j axis.

More precisely, we can think of cutting a function f(x₁,...,x_n) along the x_j axis, holding every variable except x_j constant.

From the definition, we have the partial derivative at a point p of the function along this slice as

{\partial \mathbf {f}  \over \partial x_{j}}=\lim _{t\rightarrow 0}{\mathbf {f} (\mathbf {p} +t\mathbf {e} _{j})-\mathbf {f} (\mathbf {p} ) \over t}

provided this limit exists.

Instead of the basis vector, which corresponds to taking the derivative along that axis, we can pick a vector in any direction (which we usually take as being a unit vector), and we take the directional derivative of a function as

{\partial \mathbf {f}  \over \partial \mathbf {d} }=\lim _{t\rightarrow 0}{\mathbf {f} (\mathbf {p} +t\mathbf {d} )-\mathbf {f} (\mathbf {p} ) \over t}

where d is the direction vector.
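
The limit above can be approximated numerically by choosing a small but finite t. The following is a minimal sketch, not a definitive implementation: the function f(x, y) = x²y, the point p and the step size t are all assumptions chosen for illustration, and a symmetric (central) difference is used in place of the one-sided quotient because it is usually more accurate.

```python
import math

def directional_derivative(f, p, d, t=1e-6):
    """Central-difference estimate of the limit (f(p + t d) - f(p)) / t."""
    forward = [pi + t * di for pi, di in zip(p, d)]
    backward = [pi - t * di for pi, di in zip(p, d)]
    return (f(forward) - f(backward)) / (2 * t)

def f(x):
    # Hypothetical example function: f(x, y) = x^2 * y
    return x[0] ** 2 * x[1]

p = [1.0, 2.0]
d = [1 / math.sqrt(2), 1 / math.sqrt(2)]  # unit vector along the diagonal
e1 = [1.0, 0.0]                           # basis vector: recovers the partial derivative in x

dd_diag = directional_derivative(f, p, d)
dd_x = directional_derivative(f, p, e1)
print(dd_diag, dd_x)
```

Choosing d to be a basis vector reproduces the ordinary partial derivative, which for this f at (1, 2) is 2xy = 4.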

If we want to calculate directional derivatives, working from the limit definition is rather painful. Fortunately, we have the following: if f : Rⁿ → R is differentiable at a point p and |d|=1, then

{\partial \mathbf {f}  \over \partial \mathbf {d} }=D_{\mathbf {p} }\mathbf {f} (\mathbf {d} )

There is a closely related formulation which we'll look at in the next section.

Gradient vectors

The partial derivatives of a scalar function tell us how much it changes if we move along one of the axes. What if we move in a different direction?

We'll call the scalar function f, and consider what happens if we move by an infinitesimal displacement dr=(dx,dy,dz). By the chain rule,

\mathbf {df} =dx{\frac {\partial f}{\partial x}}+dy{\frac {\partial f}{\partial y}}+dz{\frac {\partial f}{\partial z}}

This is the dot product of dr with a vector whose components are the partial derivatives of f, called the gradient of f

\operatorname {grad} \mathbf {f} =\nabla \mathbf {f} =\left({\frac {\partial \mathbf {f} (\mathbf {p} )}{\partial x_{1}}},\cdots ,{\frac {\partial \mathbf {f} (\mathbf {p} )}{\partial x_{n}}}\right)

We can then form the directional derivative at a point p in the direction d by taking the dot product of the gradient with d:

{\partial \mathbf {f} (\mathbf {p} ) \over \partial \mathbf {d} }=\mathbf {d} \cdot \nabla \mathbf {f} (\mathbf {p} )
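
As a rough numerical check of this formula, the sketch below estimates the gradient by finite differences and compares d · ∇f(p) with a direct estimate from the limit definition. The function f(x, y, z) = xy + z², the point p, the direction d and the step size t are all assumptions made for illustration.

```python
import math

def gradient(f, p, t=1e-6):
    """Estimate each partial derivative of f at p by a central difference."""
    grad = []
    for j in range(len(p)):
        forward = list(p)
        backward = list(p)
        forward[j] += t
        backward[j] -= t
        grad.append((f(forward) - f(backward)) / (2 * t))
    return grad

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def f(x):
    # Hypothetical example: f(x, y, z) = x*y + z^2
    return x[0] * x[1] + x[2] ** 2

p = [1.0, 2.0, 3.0]
d = [2 / 3, 1 / 3, 2 / 3]  # unit vector: (2/3)^2 + (1/3)^2 + (2/3)^2 = 1

grad_f = gradient(f, p)        # exact gradient is (y, x, 2z) = (2, 1, 6)
via_gradient = dot(d, grad_f)  # d . grad f(p)

# Direct one-sided estimate from the limit definition, for comparison
t = 1e-6
shifted = [pi + t * di for pi, di in zip(p, d)]
via_limit = (f(shifted) - f(p)) / t
print(grad_f, via_gradient, via_limit)
```

The two estimates should agree up to the finite-difference error; the exact value here is 17/3.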

Notice that grad f looks like a vector multiplied by a scalar. This particular combination of partial derivatives is commonplace, so we abbreviate it to

\nabla =\left({\frac {\partial }{\partial x}},{\frac {\partial }{\partial y}},{\frac {\partial }{\partial z}}\right)

We can write the action of taking the gradient vector by writing this as an operator. Recall that in the one-variable case we can write d/dx for the action of taking the derivative with respect to x. This case is similar, but ∇ acts like a vector.

We can also write the action of taking the gradient vector as:

\nabla =\left({\frac {\partial }{\partial x_{1}}},{\frac {\partial }{\partial x_{2}}},\cdots ,{\frac {\partial }{\partial x_{n}}}\right)

Properties of the gradient vector

Geometry

Grad f(p) is a vector pointing in the direction in which f increases most steeply. Its magnitude, |grad f(p)|, is the slope of f in that direction, i.e. the greatest rate of change of f at that point.

For example, consider h(x, y)=x²+y². The level sets of h are concentric circles, centred on the origin, and

\nabla h=(h_{x},h_{y})=2(x,y)=2\mathbf {r}

so grad h points directly away from the origin, at right angles to the contours.

At a point p, (∇f)(p) is perpendicular to the level set {x | f(x)=f(p)} passing through p.

If dr points along the contours of f, where the function is constant, then df will be zero. Since df is the dot product of dr and grad f, those two vectors must be at right angles, i.e. the gradient is at right angles to the contours.
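
Both geometric claims can be checked numerically for h(x, y)=x²+y². The sketch below is illustrative only: the sample point (3, 4) and the 0.1 degree scan over directions are assumptions, and the gradient 2(x, y) is taken from the example above.

```python
import math

def grad_h(x, y):
    # Gradient of h(x, y) = x^2 + y^2, i.e. 2(x, y)
    return (2 * x, 2 * y)

x, y = 3.0, 4.0
g = grad_h(x, y)     # (6.0, 8.0)
tangent = (-y, x)    # direction along the circle through (x, y)

# Zero: moving along the contour does not change h
along_contour = g[0] * tangent[0] + g[1] * tangent[1]

# Scan unit directions d = (cos a, sin a) and keep the one with the largest d . grad h
best_angle = max(
    (2 * math.pi * k / 3600 for k in range(3600)),
    key=lambda a: math.cos(a) * g[0] + math.sin(a) * g[1],
)
best_slope = math.cos(best_angle) * g[0] + math.sin(best_angle) * g[1]
print(along_contour, best_slope, math.hypot(*g))
```

The steepest direction found by the scan should lie along grad h, with a slope close to |grad h| = 10.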

Algebraic properties

Like d/dx, ∇ is linear. For any pair of constants, a and b, and any pair of scalar functions, f and g,

{\frac {d}{dx}}(af+bg)=a{\frac {d}{dx}}f+b{\frac {d}{dx}}g\quad \nabla (af+bg)=a\nabla f+b\nabla g
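
Linearity can also be spot-checked numerically. In the sketch below, the functions f and g, the constants a and b, and the point p are hypothetical choices made only for illustration, and the gradient is estimated by central differences rather than computed exactly.

```python
import math

def gradient(func, p, t=1e-6):
    """Central-difference estimate of the gradient of func at the point p."""
    grad = []
    for j in range(len(p)):
        forward, backward = list(p), list(p)
        forward[j] += t
        backward[j] -= t
        grad.append((func(forward) - func(backward)) / (2 * t))
    return grad

# Hypothetical scalar functions and constants, chosen only for illustration
f = lambda x: x[0] ** 2 * x[1]
g = lambda x: math.sin(x[0]) + x[1]
a, b = 3.0, -2.0
p = [1.0, 2.0]

# Left-hand side: grad(af + bg)
lhs = gradient(lambda x: a * f(x) + b * g(x), p)
# Right-hand side: a grad f + b grad g
rhs = [a * fi + b * gi for fi, gi in zip(gradient(f, p), gradient(g, p))]
print(lhs, rhs)
```

The two lists should agree to within the finite-difference error.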

Since ∇ acts like a vector, we can try taking its dot and cross products with other vectors, and with itself.