Linear Algebra/Topic: Geometry of Linear Maps


The pictures below contrast the maps $x\mapsto e^x$ and $x\mapsto x^2$, which are nonlinear, with $x\mapsto 2x$ and $x\mapsto -x$, which are linear. Each of the four pictures shows the domain $\mathbb{R}^1$ on the left mapped to the codomain $\mathbb{R}^1$ on the right. Arrows trace out where each map sends $x=0$, $x=1$, $x=2$, $x=-1$, and $x=-2$. Note how the nonlinear maps distort the domain in transforming it into the range. For instance, under the exponential map $e^1$ is further from $e^2$ than it is from $e^0$: the map spreads the domain out unevenly, so that an interval near $x=2$ is spread apart more than an interval near $x=0$ when they are carried over to the range.

Linalg exp function arrows.png


Linalg square function arrows.png

The linear maps are nicer, more regular, in that for each map all of the domain is spread by the same factor.

Linalg double function arrows.png


Linalg negate function arrows.png
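The contrast between uneven and even spreading can be checked numerically. The following is a minimal sketch, assuming the four maps are $e^x$, $x^2$, $2x$, and $-x$ and that the sample inputs are $-2,\ldots,2$; the helper name `gaps` is hypothetical.

```python
import math

def gaps(f, points):
    """Distances between the images of consecutive domain points."""
    images = [f(x) for x in points]
    return [abs(b - a) for a, b in zip(images, images[1:])]

points = [-2, -1, 0, 1, 2]   # evenly spaced sample inputs (an assumption)

exp_gaps = gaps(math.exp, points)             # nonlinear: gaps keep growing
square_gaps = gaps(lambda x: x * x, points)   # nonlinear: gaps differ
double_gaps = gaps(lambda x: 2 * x, points)   # linear: every gap is 2
negate_gaps = gaps(lambda x: -x, points)      # linear: every gap is 1

print(square_gaps)   # [3, 1, 1, 3]
print(double_gaps)   # [2, 2, 2, 2]
print(negate_gaps)   # [1, 1, 1, 1]
```

For the linear maps every unit interval of the domain is carried to an interval of the same width, while for the nonlinear maps the width depends on where the interval sits.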

The only linear maps from $\mathbb{R}^1$ to $\mathbb{R}^1$ are multiplications by a scalar. In higher dimensions more can happen. For instance, this linear transformation of $\mathbb{R}^2$ rotates vectors counterclockwise, and is not just a scalar multiplication.

Linalg rotate R2.png

The transformation of $\mathbb{R}^3$ which projects vectors into the $xz$-plane is also not just a rescaling.

Linalg projection R3 to xz.png
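As a rough illustration, both maps can be written as matrices with respect to the standard bases and applied to sample vectors. This sketch assumes a quarter-turn angle and takes the projection to be $(x,y,z)\mapsto(x,0,z)$; the helper `apply` is a hypothetical name.

```python
import math

def apply(matrix, vector):
    """Multiply a matrix (a list of rows) by a column vector (a list)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

theta = math.pi / 2                       # quarter turn, chosen for illustration
rotate = [[math.cos(theta), -math.sin(theta)],
          [math.sin(theta),  math.cos(theta)]]

project_xz = [[1, 0, 0],
              [0, 0, 0],                  # the y-component is sent to zero
              [0, 0, 1]]

e1_rotated = apply(rotate, [1, 0])           # approximately [0, 1]
v_projected = apply(project_xz, [1, 2, 3])   # [1, 0, 3]
```

Neither output is a scalar multiple of its input, which is the sense in which these maps are "not just a rescaling".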

Nonetheless, even in higher dimensions the situation isn't too complicated.

Below, we use the standard bases to represent each linear map $h\colon\mathbb{R}^n\to\mathbb{R}^m$ by a matrix $H$. Recall that any $H$ can be factored $H=PBQ$, where $P$ and $Q$ are nonsingular and $B$ is a partial-identity matrix. Further, recall that nonsingular matrices factor into elementary matrices, $PBQ=T_nT_{n-1}\cdots T_jBT_{j-1}\cdots T_1$, which are matrices that are obtained from the identity $I$ with one Gaussian step

$$I\xrightarrow{k\rho_i}M_i(k)\qquad I\xrightarrow{\rho_i\leftrightarrow\rho_j}P_{i,j}\qquad I\xrightarrow{k\rho_i+\rho_j}C_{i,j}(k)$$

($i\neq j$, $k\neq 0$). So if we understand the effect of a linear map described by a partial-identity matrix, and the effect of linear maps described by the elementary matrices, then we will in some sense understand the effect of any linear map. (The pictures below stick to transformations of $\mathbb{R}^2$ for ease of drawing, but the statements hold for maps from any $\mathbb{R}^n$ to any $\mathbb{R}^m$.)
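To make the three kinds of elementary matrix concrete, here is a small sketch that builds each one from the identity with a single row operation and multiplies a few together. The function names are hypothetical, and the indices are 0-based, so the text's row 1 is index 0 here.

```python
def identity(n):
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]

def scale_row(n, i, k):
    """Row-scaling matrix: multiply row i of the identity by k (k != 0)."""
    m = identity(n)
    m[i][i] = k
    return m

def swap_rows(n, i, j):
    """Row-swap matrix: exchange rows i and j of the identity."""
    m = identity(n)
    m[i], m[j] = m[j], m[i]
    return m

def add_multiple(n, i, j, k):
    """Row-combination matrix: add k times row i to row j (i != j)."""
    m = identity(n)
    m[j][i] = k
    return m

def matmul(a, b):
    return [[sum(a[r][t] * b[t][c] for t in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

# A nonsingular matrix assembled as a product of elementary matrices.
H = matmul(add_multiple(2, 0, 1, 2),
           matmul(scale_row(2, 0, 3), swap_rows(2, 0, 1)))
print(H)   # [[0, 3], [1, 6]]
```

Reading the product from right to left gives the geometric story: swap the axes, stretch along the first axis, then skew.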

The geometric effect of the linear transformation represented by a partial-identity matrix is projection.

For the $M_i(k)$ matrices, the geometric action of a transformation represented by such a matrix (with respect to the standard basis) is to stretch vectors by a factor of $k$ along the $i$-th axis. This map stretches by a factor of $3$ along the $x$-axis.

Linalg dilation in x.png

Note that if $0\leq k<1$ or if $k<0$ then the $i$-th component goes the other way; here, toward the left.

Linalg dilation in x 2.png

Either of these is a dilation.

The action of a transformation represented by a $P_{i,j}$ permutation matrix is to interchange the $i$-th and $j$-th axes; this is a particular kind of reflection.

Linalg exchange x and y.png

In higher dimensions, permutations involving many axes can be decomposed into a combination of swaps of pairs of axes; see Problem 5.
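The decomposition can be illustrated, though not proved, by a short routine that fixes one position at a time. This is only a sketch of one possible strategy; the names `swaps_for` and `apply_swaps` are hypothetical, and positions are numbered from 0.

```python
def swaps_for(perm):
    """Return index pairs whose successive swaps turn the ordering
    [0, 1, ..., n-1] into perm."""
    current = list(range(len(perm)))
    swaps = []
    for position, target in enumerate(perm):
        where = current.index(target)   # where the wanted entry sits now
        if where != position:
            current[position], current[where] = current[where], current[position]
            swaps.append((position, where))
    return swaps

def apply_swaps(items, swaps):
    """Carry out the listed pairwise swaps on a copy of items."""
    items = list(items)
    for a, b in swaps:
        items[a], items[b] = items[b], items[a]
    return items

perm = [2, 0, 3, 1]
swaps = swaps_for(perm)
print(swaps)                                       # [(0, 2), (1, 2), (2, 3)]
print(apply_swaps(["x", "y", "z", "w"], swaps))    # ['z', 'x', 'w', 'y']
```

Each pass puts one more coordinate in its final place and never disturbs the earlier ones, which is the same idea as the induction suggested in the hint to Problem 5.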

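The remaining case is that of matrices of the form $C_{i,j}(k)$. Recall that, for instance, $C_{1,2}(2)$ performs $2\rho_1+\rho_2$.

$$\begin{pmatrix}x\\ y\end{pmatrix}\mapsto\begin{pmatrix}1&0\\ 2&1\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}=\begin{pmatrix}x\\ 2x+y\end{pmatrix}$$

In the picture below, the vector $\vec{u}$ with the first component of $1$ is affected less than the vector $\vec{v}$ with the first component of $2$: $h(\vec{u})$ is only $2$ higher than $\vec{u}$ while $h(\vec{v})$ is $4$ higher than $\vec{v}$.

Linalg lin map depends on coords.png

Any vector with a first component of $1$ would be affected as is $\vec{u}$; it would be slid up by $2$. And any vector with a first component of $2$ would be slid up $4$, as was $\vec{v}$. That is, the transformation represented by $C_{i,j}(k)$ affects vectors depending on their $i$-th component.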

Another way to see this same point is to consider the action of this map on the unit square. In the next picture, vectors with a first component of $0$, like the origin, are not pushed vertically at all but vectors with a positive first component are slid up. Here, all vectors with a first component of $1$ (the entire right side of the square) are affected to the same extent. More generally, vectors on the same vertical line are slid up the same amount, namely, they are slid up by twice their first component. The resulting shape, a parallelogram, has the same base and height as the square (and thus the same area) but the right angles are gone.

Linalg lin map depends on coords 2.png
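The claim about area can be checked directly. The sketch below assumes the map is $(x,y)\mapsto(x,2x+y)$, that is, each point is slid up by twice its first component; the helper names are hypothetical.

```python
def skew(point, k=2):
    """Slide a point up by k times its first component."""
    x, y = point
    return (x, k * x + y)

def area(polygon):
    """Shoelace formula for a simple polygon given as a list of vertices."""
    total = 0
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [skew(p) for p in square]

print(image)                       # [(0, 0), (1, 2), (1, 3), (0, 1)]
print(area(square), area(image))   # 1.0 1.0
```

The left side of the square stays put, the right side moves up by $2$, and the area is unchanged.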

For contrast the next picture shows the effect of the map represented by $C_{2,1}(2)$. In this case, vectors are affected according to their second component. The vector $\binom{x}{y}$ is slid horizontally by twice $y$.

Linalg lin map depends on coords 3.png

Because of this action, this kind of map is called a skew.

With that, we have covered the geometric effect of the four types of components in the expansion $H=T_nT_{n-1}\cdots T_jBT_{j-1}\cdots T_1$, the partial-identity projection $B$ and the elementary $T_i$'s. Since we understand its components, we in some sense understand the action of any $H$. As an illustration of this assertion, recall that under a linear map, the image of a subspace is a subspace and thus the linear transformation $h$ represented by $H$ maps lines through the origin to lines through the origin. (The dimension of the image space cannot be greater than the dimension of the domain space, so a line can't map onto, say, a plane.) We will extend that to show that any line, not just those through the origin, is mapped by $h$ to a line (possibly a degenerate line, that is, a single point). The proof is simply that the partial-identity projection $B$ and the elementary $T_i$'s each turn a line input into a line output (verifying the four cases is Problem 6), and therefore their composition also preserves lines. Thus, by understanding its components we can understand arbitrary square matrices $H$, in the sense that we can prove things about them.
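A numerical spot check of the line-preservation claim, not a proof, might look like the following. The matrix and the line are arbitrary choices made for illustration, and `apply` and `collinear` are hypothetical helper names.

```python
def apply(matrix, vector):
    """Multiply a matrix (a list of rows) by a column vector (a list)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def collinear(p, q, r, tol=1e-9):
    """True when the plane points p, q, r lie on one line."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(cross) <= tol

H = [[1, 2],
     [3, -1]]                      # arbitrary example matrix (an assumption)
point, direction = [0, 1], [1, 1]  # a line that misses the origin

# Three points on the line point + t*direction, then their images under H.
on_line = [[point[0] + t * direction[0], point[1] + t * direction[1]]
           for t in (0, 1, 5)]
images = [apply(H, v) for v in on_line]

print(images)               # [[2, -1], [5, 1], [17, 9]]
print(collinear(*images))   # True
```

The three image points again lie on one line, and that line does not pass through the origin either.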

An understanding of the geometric effect of linear transformations on $\mathbb{R}^n$ is very important in mathematics. Here is a familiar application from calculus. On the left is a picture of the action of the nonlinear function $y(x)=x^2+x$. As at the start of this Topic, overall the geometric effect of this map is irregular in that at different domain points it has different effects (e.g., as the domain point $x$ goes from $2$ to $-2$, the associated range point $y(x)$ at first decreases, then pauses instantaneously, and then increases).

Linalg nonlin function arrows.png

But in calculus we don't focus on the map overall, we focus instead on the local effect of the map.

At $x=1$ the derivative is $y'(1)=3$, so that near $x=1$ we have $\Delta y\approx 3\cdot\Delta x$.

That is, in a neighborhood of $x=1$, in carrying the domain to the codomain this map causes it to grow by a factor of $3$; it is, locally, approximately, a dilation.

The picture below shows a small interval in the domain $(x-\Delta x\,..\,x+\Delta x)$ carried over to an interval in the codomain $(y-\Delta y\,..\,y+\Delta y)$ that is three times as wide: $\Delta y\approx 3\cdot\Delta x$.

Linalg nonlin locally lin.png

(When the above picture is drawn in the traditional Cartesian way then the prior sentence about the rate of growth of $y(x)$ is usually stated: the derivative $y'(1)=3$ gives the slope of the line tangent to the graph at the point $(1,2)$.)
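The "locally, approximately, a dilation" idea can be seen by measuring how much a short interval starting at $x=1$ is stretched. This sketch takes $y(x)=x^2+x$ as the example function, which is an assumption here (any function whose derivative at $1$ is $3$ would serve); `stretch_factor` is a hypothetical name.

```python
def y(x):
    return x * x + x   # example function, assumed for illustration

def stretch_factor(f, x, dx):
    """Width of the image of [x, x + dx] divided by the width dx."""
    return (f(x + dx) - f(x)) / dx

for dx in (0.5, 0.1, 0.001):
    # The ratio approaches 3 as the interval shrinks.
    print(dx, stretch_factor(y, 1.0, dx))
```

For this particular function the ratio works out to $3+\Delta x$, so the shorter the interval, the closer the map is to a pure dilation by $3$.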

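In higher dimensions, the idea is the same but the approximation is not just the $\mathbb{R}^1$-to-$\mathbb{R}^1$ scalar multiplication case. Instead, for a function $y\colon\mathbb{R}^n\to\mathbb{R}^m$ and a point $\vec{x}\in\mathbb{R}^n$, the derivative is defined to be the linear map $h\colon\mathbb{R}^n\to\mathbb{R}^m$ best approximating how $y$ changes near $y(\vec{x})$. So the geometry studied above applies.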

We will close this Topic by remarking how this point of view makes clear an often-misunderstood, but very important, result about derivatives: the derivative of the composition of two functions is computed by using the Chain Rule for combining their derivatives. Recall that (with suitable conditions on the two functions)

$$\frac{d\,(g\circ f)}{dx}(x)=\frac{dg}{dx}(f(x))\cdot\frac{df}{dx}(x)$$

so that, for instance, the derivative of $\sin(x^2+3x)$ is $\cos(x^2+3x)\cdot(2x+3)$. How does this combination arise? From this picture of the action of the composition.

Linalg composition for derivatives.png

The first map $f$ dilates the neighborhood of $x$ by a factor of

$$\frac{df}{dx}(x)$$

and the second map $g$ dilates some more, this time dilating a neighborhood of $f(x)$ by a factor of

$$\frac{dg}{dx}(f(x))$$

and as a result, the composition dilates by the product of these two.

In higher dimensions the map expressing how a function changes near a point is a linear map, and is expressed as a matrix. (So we understand the basic geometry of higher-dimensional derivatives; they are compositions of dilations, interchanges of axes, skews, and a projection.) And, the Chain Rule just multiplies the matrices.
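The statement that the Chain Rule multiplies the matrices can be checked numerically. In the sketch below the functions `f` and `g` are made-up examples, the derivative matrices are approximated by finite differences rather than computed exactly, and all helper names are hypothetical.

```python
import math

def f(v):
    x, y = v
    return [x * y, x + y]

def g(v):
    u, w = v
    return [math.sin(u), u * w]

def jacobian(func, v, h=1e-6):
    """Finite-difference approximation of the derivative matrix of func at v."""
    base = func(v)
    columns = []
    for i in range(len(v)):
        bumped = list(v)
        bumped[i] += h
        columns.append([(b - a) / h for a, b in zip(base, func(bumped))])
    # columns[i] holds the partial derivatives with respect to v[i];
    # transpose so that each row belongs to one output component.
    return [list(row) for row in zip(*columns)]

def matmul(a, b):
    return [[sum(a[r][t] * b[t][c] for t in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

point = [1.0, 2.0]
composed = jacobian(lambda v: g(f(v)), point)                 # derivative of g∘f
product = matmul(jacobian(g, f(point)), jacobian(f, point))   # Chain Rule
max_gap = max(abs(composed[r][c] - product[r][c])
              for r in range(2) for c in range(2))
print(max_gap)   # small; limited only by the finite-difference error
```

Note that the derivative of `g` is taken at `f(point)`, not at `point`, mirroring the one-variable formula.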

Thus, the geometry of linear maps is appealing both for its simplicity and for its usefulness.


Problem 1

Let   be the transformation that rotates vectors clockwise by   radians.

  1. Find the matrix $H$ representing $h$ with respect to the standard bases. Use Gauss' method to reduce $H$ to the identity.
  2. Translate the row reduction to a matrix equation $T_jT_{j-1}\cdots T_1H=I$ (the prior item shows both that $H$ is row equivalent to $I$, and that no column operations are needed to derive $I$ from $H$).
  3. Solve this matrix equation for $H$.
  4. Sketch the geometric effect of the matrix, that is, sketch how $H$ is expressed as a combination of dilations, flips, skews, and projections (the identity is a trivial projection).

Problem 2

What combination of dilations, flips, skews, and projections produces a rotation counterclockwise by   radians?

Problem 3

What combination of dilations, flips, skews, and projections produces the map   represented with respect to the standard bases by this matrix?

Problem 4

Show that any linear transformation of $\mathbb{R}^1$ is the map that multiplies by a scalar, $x\mapsto kx$.

Problem 5

Show that for any permutation (that is, reordering) $p$ of the numbers $1$, ..., $n$, the map

$$\begin{pmatrix}x_1\\ x_2\\ \vdots\\ x_n\end{pmatrix}\mapsto\begin{pmatrix}x_{p(1)}\\ x_{p(2)}\\ \vdots\\ x_{p(n)}\end{pmatrix}$$

can be accomplished with a composition of maps, each of which only swaps a single pair of coordinates. Hint: it can be done by induction on $n$. (Remark: in the fourth chapter we will show this and we will also show that the parity of the number of swaps used is determined by $p$. That is, although a particular permutation could be accomplished in two different ways with two different numbers of swaps, either both ways use an even number of swaps, or both use an odd number.)

Problem 6

Show that linear maps preserve the linear structures of a space.

  1. Show that for any linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$, the image of any line is a line. The image may be a degenerate line, that is, a single point.
  2. Show that the image of any linear surface is a linear surface. This generalizes the result that under a linear map the image of a subspace is a subspace.
  3. Linear maps preserve other linear ideas. Show that linear maps preserve "betweenness": if the point $B$ is between $A$ and $C$ then the image of $B$ is between the image of $A$ and the image of $C$.

Problem 7

Use a picture like the one that appears in the discussion of the Chain Rule to answer: if a function $f\colon\mathbb{R}\to\mathbb{R}$ has an inverse, what's the relationship between how the function (locally, approximately) dilates space, and how its inverse dilates space?

