Linear Algebra/Print version/Part 2
Chapter III - Maps Between Spaces
Section I - Isomorphisms
In the examples following the definition of a vector space we developed the intuition that some spaces are "the same" as others. For instance, the space of two-tall column vectors and the space of two-wide row vectors are not equal because their elements—column vectors and row vectors—are not equal, but we have the idea that these spaces differ only in how their elements appear. We will now make this idea precise.
This section illustrates a common aspect of a mathematical investigation. With the help of some examples, we've gotten an idea. We will next give a formal definition, and then we will produce some results backing our contention that the definition captures the idea. We've seen this happen already, for instance, in the first section of the Vector Space chapter. There, the study of linear systems led us to consider collections closed under linear combinations. We defined such a collection as a vector space, and we followed it with some supporting results.
Of course, that definition wasn't an end point; instead it led to new insights such as the idea of a basis. Here too, after producing a definition and supporting it, we will get two surprises (pleasant ones). First, we will find that the definition applies to some unforeseen, and interesting, cases. Second, the study of the definition will lead to new ideas. In this way, our investigation will build momentum.
1 - Definition and Examples
We start with two examples that suggest the right definition.
- Example 1.1
Consider the example mentioned above, the space of two-wide row vectors and the space of two-tall column vectors. They are "the same" in that if we associate the vectors that have the same components, e.g.,
then this correspondence preserves the operations, for instance this addition
and this scalar multiplication.
More generally stated, under the correspondence
both operations are preserved:
and
(all of the variables are real numbers).
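For concreteness, one way to write out the general correspondence and the two preservation conditions, in a sketch using an obvious notation:
\[
(a_1 \;\; a_2) \;\longleftrightarrow\; \begin{pmatrix} a_1 \\ a_2 \end{pmatrix},
\qquad
(a_1 \;\; a_2) + (b_1 \;\; b_2) \;\longleftrightarrow\; \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix},
\qquad
r\cdot(a_1 \;\; a_2) \;\longleftrightarrow\; r\cdot\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}.
\]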
- Example 1.2
Another two spaces we can think of as "the same" are , the space of quadratic polynomials, and . A natural correspondence is this.
The structure is preserved: corresponding elements add in a corresponding way
and scalar multiplication corresponds also.
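Sketched in coordinates, the natural correspondence pairs a quadratic with the column of its coefficients, and sums then correspond to sums:
\[
a_0 + a_1x + a_2x^2 \;\longleftrightarrow\; \begin{pmatrix} a_0 \\ a_1 \\ a_2 \end{pmatrix},
\qquad
(a_0 + a_1x + a_2x^2) + (b_0 + b_1x + b_2x^2) \;\longleftrightarrow\; \begin{pmatrix} a_0+b_0 \\ a_1+b_1 \\ a_2+b_2 \end{pmatrix}.
\]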
- Definition 1.3
An isomorphism between two vector spaces and is a map that
- is a correspondence: is one-to-one and onto;[1]
- preserves structure: if then
(we write , read " is isomorphic to ", when such a map exists).
("Morphism" means map, so "isomorphism" means a map expressing sameness.)
- Example 1.4
The vector space of functions of is isomorphic to the vector space under this map.
We will check this by going through the conditions in the definition.
We will first verify condition 1, that the map is a correspondence between the sets underlying the spaces.
To establish that is one-to-one, we must prove that only when . If
then, by the definition of ,
from which we can conclude that and because column vectors are equal only when they have equal components. We've proved that implies that , which shows that is one-to-one.
To check that is onto we must check that any member of the codomain is the image of some member of the domain . But that's clear—any
is the image under of .
Next we will verify condition (2), that preserves structure.
This computation shows that preserves addition.
A similar computation shows that preserves scalar multiplication.
With that, conditions (1) and (2) are verified, so we know that is an isomorphism and we can say that the spaces are isomorphic .
- Example 1.5
Let be the space of linear combinations of three variables , , and , under the natural addition and scalar multiplication operations. Then is isomorphic to , the space of quadratic polynomials.
To show this we will produce an isomorphism map. There is more than one possibility; for instance, here are four.
The first map is the most natural correspondence in that it just carries the coefficients over. However, below we shall verify that the second one is an isomorphism, to underline that there are isomorphisms other than just the obvious one (showing that is an isomorphism is Problem 3).
To show that is one-to-one, we will prove that if then . The assumption that gives, by the definition of , that . Equal polynomials have equal coefficients, so , , and . Thus implies that and therefore is one-to-one.
The map is onto because any member of the codomain is the image of some member of the domain, namely it is the image of . For instance, is .
The computations for structure preservation are like those in the prior example. This map preserves addition
and scalar multiplication.
Thus is an isomorphism and we write .
We are sometimes interested in an isomorphism of a space with itself, called an automorphism. An identity map is an automorphism. The next two examples show that there are others.
- Example 1.6
A dilation map that multiplies all vectors by a nonzero scalar is an automorphism of .
A rotation or turning map that rotates all vectors through an angle is an automorphism.
A third type of automorphism of is a map that flips or reflects all vectors over a line through the origin.
See Problem 20.
- Example 1.7
Consider the space of polynomials of degree 5 or less and the map that sends a polynomial to . For instance, under this map and . This map is an automorphism of this space; the check is Problem 12.
This isomorphism of with itself does more than just tell us that the space is "the same" as itself. It gives us some insight into the space's structure. For instance, below is shown a family of parabolas, graphs of members of . Each has a vertex at , and the left-most one has zeroes at and , the next one has zeroes at and , etc.
Geometrically, the substitution of for in any function's argument shifts its graph to the right by one. Thus, and 's action is to shift all of the parabolas to the right by one. Notice that the picture before is applied is the same as the picture after is applied, because while each parabola moves to the right, another one comes in from the left to take its place. This also holds true for cubics, etc. So the automorphism gives us the insight that has a certain horizontal homogeneity; this space looks the same near as near .
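For instance, assuming the shift described here is by one unit to the right, the map and its effect on a typical member can be sketched as
\[
p(x) \;\longmapsto\; p(x-1),
\qquad\text{e.g.}\quad
x^2 \;\longmapsto\; (x-1)^2 = x^2 - 2x + 1 .
\]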
As described in the preamble to this section, we will next produce some results supporting the contention that the definition of isomorphism above captures our intuition of vector spaces being the same.
Of course the definition itself is persuasive: a vector space consists of two components, a set and some structure, and the definition simply requires that the sets correspond and that the structures correspond also. Also persuasive are the examples above. In particular, Example 1.1, which gives an isomorphism between the space of two-wide row vectors and the space of two-tall column vectors, dramatizes our intuition that isomorphic spaces are the same in all relevant respects. Sometimes people say, where , that " is just painted green"—any differences are merely cosmetic.
Further support for the definition, in case it is needed, is provided by the following results that, taken together, suggest that all the things of interest in a vector space correspond under an isomorphism. Since we studied vector spaces to study linear combinations, "of interest" means "pertaining to linear combinations". Not of interest is the way that the vectors are presented typographically (or their color!).
As an example, although the definition of isomorphism doesn't explicitly say that the zero vectors must correspond, it is a consequence of that definition.
- Lemma 1.8
An isomorphism maps a zero vector to a zero vector.
- Proof
Where is an isomorphism, fix any . Then .
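In symbols, one way to carry out that computation: for any $\vec{v}$ in the domain,
\[
f(\vec{0}_V) = f(0 \cdot \vec{v}) = 0 \cdot f(\vec{v}) = \vec{0}_W .
\]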
The definition of isomorphism requires that sums of two vectors correspond and that so do scalar multiples. We can extend that to say that all linear combinations correspond.
- Lemma 1.9
For any map between vector spaces these statements are equivalent.
- preserves structure
- preserves linear combinations of two vectors
- preserves linear combinations of any finite number of vectors
- Proof
Since the implications and are clear, we need only show that . Assume statement 1. We will prove statement 3 by induction on the number of summands .
The one-summand base case, that , is covered by the assumption of statement 1.
For the inductive step assume that statement 3 holds whenever there are or fewer summands, that is, whenever , or , ..., or . Consider the -summand case. The first half of statement 1 gives
by breaking the sum along the final "". Then the inductive hypothesis lets us break up the -term sum.
Finally, the second half of statement 1 gives
when applied times.
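Assembled, the whole inductive-step computation runs along these lines:
\[
f(c_1\vec{v}_1 + \dots + c_n\vec{v}_n)
= f(c_1\vec{v}_1 + \dots + c_{n-1}\vec{v}_{n-1}) + f(c_n\vec{v}_n)
= f(c_1\vec{v}_1) + \dots + f(c_n\vec{v}_n)
= c_1 f(\vec{v}_1) + \dots + c_n f(\vec{v}_n).
\]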
In addition to adding to the intuition that the definition of isomorphism does indeed preserve the things of interest in a vector space, that lemma's second item is an especially handy way of checking that a map preserves structure.
We close with a summary. The material in this section augments the chapter on Vector Spaces. There, after giving the definition of a vector space, we informally looked at what different things can happen. Here, we defined the relation "" between vector spaces and we have argued that it is the right way to split the collection of vector spaces into cases because it preserves the features of interest in a vector space—in particular, it preserves linear combinations. That is, we have now said precisely what we mean by "the same", and by "different", and so we have precisely classified the vector spaces.
Exercises
- This exercise is recommended for all readers.
- Problem 1
Verify, using Example 1.4 as a model, that the two correspondences given before the definition are isomorphisms.
- This exercise is recommended for all readers.
- Problem 2
For the map given by
Find the image of each of these elements of the domain.
Show that this map is an isomorphism.
- Problem 3
Show that the natural map from Example 1.5 is an isomorphism.
- This exercise is recommended for all readers.
- Problem 4
Decide whether each map is an isomorphism (if it is an isomorphism then prove it and if it isn't then state a condition that it fails to satisfy).
- given by
- given by
- given by
- given by
- Problem 5
Show that the map given by is one-to-one and onto. Is it an isomorphism?
- This exercise is recommended for all readers.
- Problem 6
Refer to Example 1.1. Produce two more isomorphisms (of course, that they satisfy the conditions in the definition of isomorphism must be verified).
- Problem 7
Refer to Example 1.2. Produce two more isomorphisms (and verify that they satisfy the conditions).
- This exercise is recommended for all readers.
- Problem 8
Show that, although is not itself a subspace of , it is isomorphic to the -plane subspace of .
- Problem 9
Find two isomorphisms between and .
- This exercise is recommended for all readers.
- Problem 10
For what is isomorphic to ?
- Problem 11
For what is isomorphic to ?
- Problem 12
Prove that the map in Example 1.7, from to given by , is a vector space isomorphism.
- Problem 13
Why, in Lemma 1.8, must there be a ? That is, why must be nonempty?
- Problem 14
Are any two trivial spaces isomorphic?
- Problem 15
In the proof of Lemma 1.9, what about the zero-summands case (that is, if is zero)?
- Problem 16
Show that any isomorphism has the form for some nonzero real number .
- This exercise is recommended for all readers.
- Problem 17
These prove that isomorphism is an equivalence relation.
- Show that the identity map is an isomorphism. Thus, any vector space is isomorphic to itself.
- Show that if is an isomorphism then so is its inverse . Thus, if is isomorphic to then also is isomorphic to .
- Show that a composition of isomorphisms is an isomorphism: if is an isomorphism and is an isomorphism then so also is . Thus, if is isomorphic to and is isomorphic to , then also is isomorphic to .
- Problem 18
Suppose that preserves structure. Show that is one-to-one if and only if the unique member of mapped by to is .
- Problem 19
Suppose that is an isomorphism. Prove that the set is linearly dependent if and only if the set of images is linearly dependent.
- This exercise is recommended for all readers.
- Problem 20
Show that each type of map from Example 1.6 is an automorphism.
- Dilation by a nonzero scalar .
- Rotation through an angle .
- Reflection over a line through the origin.
Hint. For the second and third items, polar coordinates are useful.
- Problem 21
Produce an automorphism of other than the identity map, and other than a shift map .
- Problem 22
- Show that a function is an automorphism if and only if it has the form for some .
- Let be an automorphism of such that . Find .
- Show that a function is an automorphism if and only if it has the form
- Let be an automorphism of with
- Problem 24
We show that isomorphisms can be tailored to fit in that, sometimes, given vectors in the domain and in the range we can produce an isomorphism associating those vectors.
- Let be a basis for so that any has a unique representation as , which we denote in this way.
- Show that this function is one-to-one and onto.
- Show that it preserves structure.
- Produce an isomorphism from to
that fits these specifications.
- Problem 25
Prove that a space is -dimensional if and only if it is isomorphic to . Hint. Fix a basis for the space and consider the map sending a vector over to its representation with respect to .
- Problem 26
(Requires the subsection on Combining Subspaces, which is optional.) Let and be vector spaces. Define a new vector space, consisting of the set along with these operations.
This is a vector space, the external direct sum of and .
- Check that it is a vector space.
- Find a basis for, and the dimension of, the external direct sum .
- What is the relationship among , , and ?
- Suppose that and are subspaces of a vector space such that (in this case we say that is the internal direct sum of and ). Show that the map given by
2 - Dimension Characterizes Isomorphism
In the prior subsection, after stating the definition of an isomorphism, we gave some results supporting the intuition that such a map describes spaces as "the same". Here we will formalize this intuition. While two spaces that are isomorphic are not equal, we think of them as almost equal— as equivalent. In this subsection we shall show that the relationship "is isomorphic to" is an equivalence relation.[2]
- Theorem 2.1
Isomorphism is an equivalence relation between vector spaces.
- Proof
We must prove that this relation has the three properties of being symmetric, reflexive, and transitive. For each of the three we will use item 2 of Lemma 1.9 and show that the map preserves structure by showing that it preserves linear combinations of two members of the domain.
To check reflexivity, that any space is isomorphic to itself, consider the identity map. It is clearly one-to-one and onto. The calculation showing that it preserves linear combinations is easy.
To check symmetry, that if is isomorphic to via some map then there is an isomorphism going the other way, consider the inverse map . As stated in the appendix, such an inverse function exists and it is also a correspondence. Thus we have reduced the symmetry issue to checking that, because preserves linear combinations, so also does . Assume that and , i.e., that and .
Finally, we must check transitivity, that if is isomorphic to via some map and if is isomorphic to via some map then also is isomorphic to . Consider the composition . The appendix notes that the composition of two correspondences is a correspondence, so we need only check that the composition preserves linear combinations.
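A sketch of that check, with the first map written here as $f\colon V\to W$ and the second as $g\colon W\to U$:
\[
(g \circ f)(c_1\vec{v}_1 + c_2\vec{v}_2)
= g\bigl(f(c_1\vec{v}_1 + c_2\vec{v}_2)\bigr)
= g\bigl(c_1 f(\vec{v}_1) + c_2 f(\vec{v}_2)\bigr)
= c_1\, g\bigl(f(\vec{v}_1)\bigr) + c_2\, g\bigl(f(\vec{v}_2)\bigr).
\]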
Thus is an isomorphism.
As a consequence of that result, we know that the universe of vector spaces is partitioned into classes: every space is in one and only one isomorphism class.
- Theorem 2.2
Vector spaces are isomorphic if and only if they have the same dimension.
This follows from the next two lemmas.
- Lemma 2.3
If spaces are isomorphic then they have the same dimension.
- Proof
We shall show that an isomorphism of two spaces gives a correspondence between their bases. That is, where is an isomorphism and a basis for the domain is , then the image set is a basis for the codomain . (The other half of the correspondence— that for any basis of the inverse image is a basis for — follows on recalling that if is an isomorphism then is also an isomorphism, and applying the prior sentence to .)
To see that spans , fix any , note that is onto and so there is a with , and expand as a combination of basis vectors.
For linear independence of , if
then, since is one-to-one and so the only vector sent to is , we have that , implying that all of the 's are zero.
- Lemma 2.4
If spaces have the same dimension then they are isomorphic.
- Proof
To show that any two spaces of dimension are isomorphic, we can simply show that any one is isomorphic to . Then we will have shown that they are isomorphic to each other, by the transitivity of isomorphism (which was established in Theorem 2.1).
Let be -dimensional. Fix a basis for the domain . Consider the representation of the members of that domain with respect to the basis as a function from to
(it is well-defined[3] since every has one and only one such representation— see Remark 2.5 below).
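A sketch of the representation map, in one common notation, where $B = \langle \vec{\beta}_1, \dots, \vec{\beta}_n \rangle$ is the fixed basis:
\[
\vec{v} = c_1\vec{\beta}_1 + \dots + c_n\vec{\beta}_n
\;\longmapsto\;
\mathrm{Rep}_B(\vec{v}) = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix}.
\]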
This function is one-to-one because if
then
and so , ..., , and therefore the original arguments and are equal.
This function is onto; any -tall vector
is the image of some , namely .
Finally, this function preserves structure.
Thus the function is an isomorphism and thus any -dimensional space is isomorphic to the -dimensional space . Consequently, any two spaces with the same dimension are isomorphic.
- Remark 2.5
The parenthetical comment in that proof about the role played by the "one and only one representation" result requires some explanation. We need to show that (for a fixed ) each vector in the domain is associated by with one and only one vector in the codomain.
A contrasting example, where an association doesn't have this property, is illuminating. Consider this subset of , which is not a basis.
Call those four polynomials , ..., . If, mimicking the above proof, we try to write the members of as , and associate with the four-tall vector with components , ..., then there is a problem. For, consider . The set spans the space , so there is at least one four-tall vector associated with . But is not linearly independent and so vectors do not have unique decompositions. In this case, both
and so there is more than one four-tall vector associated with .
That is, with input this association does not have a well-defined (i.e., single) output value.
Any map whose definition appears possibly ambiguous must be checked to see that it is well-defined. For the map in the above proof, that check is Problem 11.
That ends the proof of Theorem 2.2. We say that the isomorphism classes are characterized by dimension because we can describe each class simply by giving the number that is the dimension of all of the spaces in that class.
This subsection's results give us a collection of representatives of the isomorphism classes.[4]
- Corollary 2.6
A finite-dimensional vector space is isomorphic to one and only one of the .
The proofs above pack many ideas into a small space. Through the rest of this chapter we'll consider these ideas again, and fill them out. For a taste of this, we will expand here on the proof of Lemma 2.4.
- Example 2.7
The space of matrices is isomorphic to . With this basis for the domain
the isomorphism given in the lemma, the representation map , simply carries the entries over.
One way to think of the map is: fix the basis for the domain and the basis for the codomain, and associate with , and with , etc. Then extend this association to all of the members of the two spaces.
We say that the map has been extended linearly from the bases to the spaces.
We can do the same thing with different bases, for instance, taking this basis for the domain.
Associating corresponding members of and and extending linearly
gives rise to an isomorphism that is different than .
The prior map arose by changing the basis for the domain. We can also change the basis for the codomain. Starting with
associating with , etc., and then linearly extending that correspondence to all of the two spaces
gives still another isomorphism.
So there is a connection between the maps between spaces and bases for those spaces. Later sections will explore that connection.
We will close this section with a summary.
Recall that in the first chapter we defined two matrices as row equivalent if they can be derived from each other by elementary row operations (this was the meaning of same-ness that was of interest there). We showed that is an equivalence relation and so the collection of matrices is partitioned into classes, where all the matrices that are row equivalent fall together into a single class. Then, for insight into which matrices are in each class, we gave representatives for the classes, the reduced echelon form matrices.
In this section, except that the appropriate notion of same-ness here is vector space isomorphism, we have followed much the same outline. First we defined isomorphism, saw some examples, and established some properties. Then we showed that it is an equivalence relation, and now we have a set of class representatives, the real vector spaces , , etc.
As before, the list of representatives helps us to understand the partition. It is simply a classification of spaces by dimension.
In the second chapter, with the definition of vector spaces, we seemed to have opened up our studies to many examples of new structures besides the familiar 's. We now know that isn't the case. Any finite-dimensional vector space is actually "the same" as a real space. We are thus considering exactly the structures that we need to consider.
The rest of the chapter fills out the work in this section. In particular, in the next section we will consider maps that preserve structure, but are not necessarily correspondences.
Exercises
- This exercise is recommended for all readers.
- Problem 1
Decide if the spaces are isomorphic.
- ,
- ,
- ,
- ,
- ,
- Answer
Each pair of spaces is isomorphic if and only if the two have the same dimension. We can, when there is an isomorphism, state a map, but it isn't strictly necessary.
- No, they have different dimensions.
- No, they have different dimensions.
- Yes, they have the same dimension.
One isomorphism is this.
- Yes, they have the same dimension. This is an isomorphism.
- Yes, both have dimension .
- This exercise is recommended for all readers.
- Problem 2
Consider the isomorphism where . Find the image of each of these elements of the domain.
- ;
- ;
- Answer
- This exercise is recommended for all readers.
- Problem 3
Show that if then .
- Answer
They have different dimensions.
- This exercise is recommended for all readers.
- Problem 4
Is ?
- Answer
Yes, both are -dimensional.
- This exercise is recommended for all readers.
- Problem 5
Are any two planes through the origin in isomorphic?
- Answer
Yes, any two (nondegenerate) planes are both two-dimensional vector spaces.
- Problem 6
Find a set of equivalence class representatives other than the set of 's.
- Answer
There are many answers, one is the set of (taking to be the trivial vector space).
- Problem 7
True or false: between any -dimensional space and there is exactly one isomorphism.
- Answer
False (except when ). For instance, if is an isomorphism then multiplying by any nonzero scalar, gives another, different, isomorphism. (Between trivial spaces the isomorphisms are unique; the only map possible is .)
- Problem 8
Can a vector space be isomorphic to one of its (proper) subspaces?
- Answer
No. A proper subspace has a strictly lower dimension than its superspace; if is a proper subspace of then any linearly independent subset of must have fewer than members or else that set would be a basis for , and wouldn't be proper.
- This exercise is recommended for all readers.
- Problem 9
This subsection shows that for any isomorphism, the inverse map is also an isomorphism. This subsection also shows that for a fixed basis of an -dimensional vector space , the map is an isomorphism. Find the inverse of this map.
- Answer
Where , the inverse is this.
- This exercise is recommended for all readers.
- Problem 10
Prove these facts about matrices.
- The row space of a matrix is isomorphic to the column space of its transpose.
- The row space of a matrix is isomorphic to its column space.
- Answer
All three spaces have dimension equal to the rank of the matrix.
- Problem 11
Show that the function from Theorem 2.2 is well-defined.
- Answer
We must show that if then . So suppose that . Each vector in a vector space (here, the domain space) has a unique representation as a linear combination of basis vectors, so we can conclude that , ..., . Thus,
and so the function is well-defined.
- Problem 12
Is the proof of Theorem 2.2 valid when ?
- Answer
Yes, because a zero-dimensional space is a trivial space.
- Problem 13
For each, decide if it is a set of isomorphism class representatives.
- Answer
- No, this collection has no spaces of odd dimension.
- Yes, because .
- No, for instance, .
- Problem 14
Let be a correspondence between vector spaces and (that is, a map that is one-to-one and onto). Show that the spaces and are isomorphic via if and only if there are bases and such that corresponding vectors have the same coordinates: .
- Answer
One direction is easy: if the two are isomorphic via then for any basis , the set is also a basis (this is shown in Lemma 2.3). The check that corresponding vectors have the same coordinates: is routine.
For the other half, assume that there are bases such that corresponding vectors have the same coordinates with respect to those bases. Because is a correspondence, to show that it is an isomorphism, we need only show that it preserves structure. Because , the map preserves structure if and only if representations preserve addition: and scalar multiplication: The addition calculation is this: , and the scalar multiplication calculation is similar.
- Problem 15
Consider the isomorphism .
- Vectors in a real space are orthogonal if and only if their dot product is zero. Give a definition of orthogonality for polynomials.
- The derivative of a member of is in . Give a definition of the derivative of a vector in .
- Answer
- Pulling the definition back from to gives that is orthogonal to if and only if .
- A natural definition is this.
- This exercise is recommended for all readers.
- Problem 16
Does every correspondence between bases, when extended to the spaces, give an isomorphism?
- Answer
Yes.
Assume that is a vector space with basis and that is another vector space such that the map is a correspondence. Consider the extension of .
The map is an isomorphism.
First, is well-defined because every member of has one and only one representation as a linear combination of elements of .
Second, is one-to-one because every member of has only one representation as a linear combination of elements of . That map is onto because every member of has at least one representation as a linear combination of members of .
Finally, preservation of structure is routine to check. For instance, here is the preservation of addition calculation.
Preservation of scalar multiplication is similar.
- Problem 17
(Requires the subsection on Combining Subspaces, which is optional.) Suppose that and that is isomorphic to the space under the map . Show that .
- Answer
Because and is one-to-one we have that . To finish, count the dimensions: , as required.
- Problem 18
- Show that this is not a well-defined function from the rational numbers to the integers: with each fraction, associate the value of its numerator.
- Answer
Rational numbers have many representations, e.g., 1/2 equals 2/4, and the numerators can vary among representations.
Footnotes
- ↑ More information on one-to-one and onto maps is in the appendix.
- ↑ More information on equivalence relations is in the appendix.
- ↑ More information on well-definedness is in the appendix.
- ↑ More information on equivalence class representatives is in the appendix.
Section II - Homomorphisms
The definition of isomorphism has two conditions. In this section we will consider the second one, that the map must preserve the algebraic structure of the space. We will focus on this condition by studying maps that are required only to preserve structure; that is, maps that are not required to be correspondences.
Experience shows that this kind of map is tremendously useful in the study of vector spaces. For one thing, as we shall see in the second subsection below, while isomorphisms describe how spaces are the same, these maps describe how spaces can be thought of as alike.
1 - Definition
- Definition 1.1
A function between vector spaces that preserves the operations of addition
if then
and scalar multiplication
if and then
is a homomorphism or linear map.
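Written out, with $h$ for the map, the two conditions are
\[
h(\vec{v}_1 + \vec{v}_2) = h(\vec{v}_1) + h(\vec{v}_2)
\qquad\text{and}\qquad
h(r\cdot\vec{v}) = r\cdot h(\vec{v})
\]
for all vectors $\vec{v}, \vec{v}_1, \vec{v}_2$ in the domain and all scalars $r$.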
- Example 1.2
The projection map
is a homomorphism.
It preserves addition
and scalar multiplication.
This map is not an isomorphism since it is not one-to-one. For instance, both and in are mapped to the zero vector in .
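A sketch of a projection of this kind, under the assumption that the map simply drops the last component, together with a check of the addition condition:
\[
\pi\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix},
\qquad
\pi\left(\begin{pmatrix} x_1 \\ y_1 \\ z_1 \end{pmatrix} + \begin{pmatrix} x_2 \\ y_2 \\ z_2 \end{pmatrix}\right)
= \begin{pmatrix} x_1+x_2 \\ y_1+y_2 \end{pmatrix}
= \pi\begin{pmatrix} x_1 \\ y_1 \\ z_1 \end{pmatrix} + \pi\begin{pmatrix} x_2 \\ y_2 \\ z_2 \end{pmatrix}.
\]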
- Example 1.3
Of course, the domain and codomain might be other than spaces of column vectors. Both of these are homomorphisms; the verifications are straightforward.
- given by
- given by
- Example 1.4
Between any two spaces there is a zero homomorphism, mapping every vector in the domain to the zero vector in the codomain.
- Example 1.5
These two suggest why we use the term "linear map".
- The map given by
- The first of these two maps is linear while the second is not.
What distinguishes the homomorphisms is that the coordinate functions are linear combinations of the arguments. See also Problem 7.
Obviously, any isomorphism is a homomorphism— an isomorphism is a homomorphism that is also a correspondence. So, one way to think of the "homomorphism" idea is that it is a generalization of "isomorphism", motivated by the observation that many of the properties of isomorphisms have only to do with the map's structure preservation property and not with its being a correspondence. As examples, these two results from the prior section do not use one-to-one-ness or onto-ness in their proofs, and therefore apply to any homomorphism.
- Lemma 1.6
A homomorphism sends a zero vector to a zero vector.
- Lemma 1.7
Each of these is a necessary and sufficient condition for to be a homomorphism.
- for any and
- for any and
Part 1 is often used to check that a function is linear.
- Example 1.8
The map given by
satisfies part 1 of the prior result
and so it is a homomorphism.
However, some of the results that we have seen for isomorphisms fail to hold for homomorphisms in general. Consider the theorem that an isomorphism between spaces gives a correspondence between their bases. Homomorphisms do not give any such correspondence; Example 1.2 shows that there is no such correspondence, and another example is the zero map between any two nontrivial spaces. Instead, for homomorphisms a weaker but still very useful result holds.
- Theorem 1.9
A homomorphism is determined by its action on a basis. That is, if is a basis of a vector space and are (perhaps not distinct) elements of a vector space then there exists a homomorphism from to sending to , ..., and to , and that homomorphism is unique.
- Proof
We will define the map by associating with , etc., and then extending linearly to all of the domain. That is, where , the map is given by . This is well-defined because, with respect to the basis, the representation of each domain vector is unique.
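In symbols, with $\langle \vec{\beta}_1, \dots, \vec{\beta}_n \rangle$ for the basis and $\vec{w}_1, \dots, \vec{w}_n$ for the prescribed images (a sketch of the construction just described), the extension is
\[
\vec{v} = c_1\vec{\beta}_1 + \dots + c_n\vec{\beta}_n
\;\longmapsto\;
h(\vec{v}) = c_1\vec{w}_1 + \dots + c_n\vec{w}_n .
\]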
This map is a homomorphism since it preserves linear combinations; where and , we have this.
And, this map is unique since if is another homomorphism such that for each then and agree on all of the vectors in the domain.
Thus, and are the same map.
- Example 1.10
This result says that we can construct a homomorphism by fixing a basis for the domain and specifying where the map sends those basis vectors. For instance, if we specify a map that acts on the standard basis in this way
then the action of on any other member of the domain is also specified. For instance, the value of on this argument
is a direct consequence of the value of on the basis vectors.
Later in this chapter we shall develop a scheme, using matrices, that is convenient for computations like this one.
Just as the isomorphisms of a space with itself are useful and interesting, so too are the homomorphisms of a space with itself.
- Definition 1.11
A linear map from a space into itself is a linear transformation.
- Remark 1.12
In this book we use "linear transformation" only in the case where the codomain equals the domain, but it is widely used in other texts as a general synonym for "homomorphism".
- Example 1.13
The map on that projects all vectors down to the -axis
is a linear transformation.
- Example 1.14
The derivative map
is a linear transformation, as this result from calculus notes: .
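The calculus fact in question is that differentiation passes through linear combinations: for differentiable $f$ and $g$ and scalars $c_1, c_2$,
\[
\frac{d}{dx}\bigl(c_1 f(x) + c_2 g(x)\bigr) = c_1\,\frac{d}{dx}f(x) + c_2\,\frac{d}{dx}g(x).
\]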
- Example 1.15
- The matrix transpose map
is a linear transformation of . Note that this transformation is one-to-one and onto, and so in fact it is an automorphism.
We finish this subsection about maps by recalling that we can linearly combine maps. For instance, for these maps from to itself
the linear combination is also a map from to itself.
- Lemma 1.16
For vector spaces and , the set of linear functions from to is itself a vector space, a subspace of the space of all functions from to . It is denoted .
- Proof
This set is non-empty because it contains the zero homomorphism. So to show that it is a subspace we need only check that it is closed under linear combinations. Let be linear. Then their sum is linear
and any scalar multiple is also linear.
Hence is a subspace.
We started this section by isolating the structure preservation property of isomorphisms. That is, we defined homomorphisms as a generalization of isomorphisms. Some of the properties that we studied for isomorphisms carried over unchanged, while others were adapted to this more general setting.
It would be a mistake, though, to view this new notion of homomorphism as derived from, or somehow secondary to, that of isomorphism. In the rest of this chapter we shall work mostly with homomorphisms, partly because any statement made about homomorphisms is automatically true about isomorphisms, but more because, while the isomorphism concept is perhaps more natural, experience shows that the homomorphism concept is actually more fruitful and more central to further progress.
Exercises
- This exercise is recommended for all readers.
- Problem 1
Decide if each is linear.
- Answer
- Yes. The verification is straightforward.
- Yes. The verification is easy.
- No. An example of an addition that is not respected is this.
- Yes. The verification is straightforward.
- This exercise is recommended for all readers.
- Problem 2
Decide if each map is linear.
- Answer
For each, we must either check that linear combinations are preserved, or give an example of a linear combination that is not.
- Yes. The check that it preserves combinations is routine.
- No. For instance, not preserved is multiplication by the scalar .
- Yes. This is the check that it preserves combinations of two members of the domain.
- No.
An example of a combination that is not preserved is this.
- This exercise is recommended for all readers.
- Problem 3
Show that these two maps are homomorphisms.
- given by maps to
- given by maps to
Are these maps inverse to each other?
- Answer
The check that each is a homomorphism is routine. Here is the check for the differentiation map.
(An alternate proof is to simply note that this is a property of differentiation that is familiar from calculus.)
These two maps are not inverses as this composition does not act as the identity map on this element of the domain.
- Problem 4
Is (perpendicular) projection from to the -plane a homomorphism? Projection to the -plane? To the -axis? The -axis? The -axis? Projection to the origin?
- Answer
Each of these projections is a homomorphism. Projection to the -plane and to the -plane are these maps.
Projection to the -axis, to the -axis, and to the -axis are these maps.
And projection to the origin is this map.
Verification that each is a homomorphism is straightforward. (The last one, of course, is the zero transformation on .)
- Problem 5
Show that, while the maps from Example 1.3 preserve linear operations, they are not isomorphisms.
- Answer
The first is not onto; for instance, there is no polynomial that is sent to the constant polynomial . The second is not one-to-one; both of these members of the domain
are mapped to the same member of the codomain, .
- Problem 6
Is an identity map a linear transformation?
- Answer
Yes; in any space .
- This exercise is recommended for all readers.
- Problem 7
Stating that a function is "linear" is different than stating that its graph is a line.
- The function given by has a graph that is a line. Show that it is not a linear function.
- The function given by
- Answer
- This map does not preserve structure since , while .
- The check is routine.
- This exercise is recommended for all readers.
- Problem 8
Part of the definition of a linear function is that it respects addition. Does a linear function respect subtraction?
- Answer
Yes. Where is linear, .
- Problem 9
Assume that is a linear transformation of and that is a basis of . Prove each statement.
- If for each basis vector then is the zero map.
- If for each basis vector then is the identity map.
- If there is a scalar such that for each basis vector then for all vectors in .
- Answer
- Let be represented with respect to the basis as . Then .
- This argument is similar to the prior one. Let be represented with respect to the basis as . Then .
- As above, only .
- This exercise is recommended for all readers.
- Problem 10
Consider the vector space where vector addition and scalar multiplication are not the ones inherited from but rather are these: is the product of and , and is the -th power of . (This was shown to be a vector space in an earlier exercise.) Verify that the natural logarithm map is a homomorphism between these two spaces. Is it an isomorphism?
- Answer
That it is a homomorphism follows from the familiar rules that the logarithm of a product is the sum of the logarithms and that the logarithm of a power is the multiple of the logarithm . This map is an isomorphism because it has an inverse, namely, the exponential map, so it is a correspondence, and therefore it is an isomorphism.
- This exercise is recommended for all readers.
- Problem 11
Consider this transformation of .
Find the image under this map of this ellipse.
- Answer
Where and , the image set is
the unit circle in the -plane.
- This exercise is recommended for all readers.
- Problem 12
Imagine a rope wound around the earth's equator so that it fits snugly (suppose that the earth is a sphere). How much extra rope must be added to raise the circle to a constant six feet off the ground?
- Answer
The circumference function is linear. Thus we have . Observe that it takes the same amount of extra rope to raise the circle from tightly wound around a basketball to six feet above that basketball as it does to raise it from tightly wound around the earth to six feet above the earth.
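Explicitly, taking the circumference map to be $r \mapsto 2\pi r$, raising the circle six feet adds
\[
2\pi(r+6) - 2\pi r = 2\pi\cdot 6 = 12\pi \approx 37.7 \text{ feet},
\]
an amount that does not depend on the radius $r$.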
- This exercise is recommended for all readers.
- Problem 13
Verify that this map
is linear. Generalize.
- Answer
Verifying that it is linear is routine.
The natural guess at a generalization is that for any fixed the map is linear. This statement is true. It follows from properties of the dot product we have seen earlier: and . (The natural guess at a generalization of this generalization, that the map from to whose action consists of taking the dot product of its argument with a fixed vector is linear, is also true.)
- Problem 14
Show that every homomorphism from to acts via multiplication by a scalar. Conclude that every nontrivial linear transformation of is an isomorphism. Is that true for transformations of ? ?
- Answer
Let be linear. A linear map is determined by its action on a basis, so fix the basis for . For any we have that and so acts on any argument by multiplying it by the constant . If is not zero then the map is a correspondence— its inverse is division by — so any nontrivial transformation of is an isomorphism.
This projection map is an example that shows that not every transformation of acts via multiplication by a constant when , including when .
- Problem 15
- Show that for any scalars this map is a homomorphism.
- Show that for each , the -th derivative operator is a linear transformation of . Conclude that for any scalars this map is a linear transformation of that space.
- Answer
- Where and are scalars, we have this.
- Each power of the derivative operator is linear because of these rules familiar from calculus.
- Problem 16
Lemma 1.16 shows that a sum of linear functions is linear and that a scalar multiple of a linear function is linear. Show also that a composition of linear functions is linear.
- Answer
(This argument has already appeared, as part of the proof that isomorphism is an equivalence.) Let and be linear. For any and scalars combinations are preserved.
- This exercise is recommended for all readers.
- Problem 17
Where is linear, suppose that , ..., for some vectors , ..., from .
- If the set of 's is independent, must the set of 's also be independent?
- If the set of 's is independent, must the set of 's also be independent?
- If the set of 's spans , must the set of 's span ?
- If the set of 's spans , must the set of 's span ?
- Answer
- Yes. The set of 's cannot be linearly independent if the set of 's is linearly dependent because any nontrivial relationship in the domain would give a nontrivial relationship in the range .
- Not necessarily. For instance, the transformation of given by
- Not necessarily. An example is the projection map
- Not necessarily. For instance, the injection map sends the standard basis for the domain to a set that does not span the codomain. (Remark. However, the set of 's does span the range. A proof is easy.)
- Problem 18
Generalize Example 1.15 by proving that the matrix transpose map is linear. What is the domain and codomain?
- Answer
Recall that the entry in row and column of the transpose of is the entry from row and column of . Now, the check is routine.
The domain is while the codomain is .
- Problem 19
- Where , the line segment connecting them is defined to be the set . Show that the image, under a homomorphism , of the segment between and is the segment between and .
- A subset of is convex if, for any two points in that set, the line segment joining them lies entirely in that set. (The inside of a sphere is convex while the skin of a sphere is not.) Prove that linear maps from to preserve the property of set convexity.
- Answer
- For any homomorphism we have
- We must show that if a subset of the domain is convex then its image, as a subset of the range, is also convex. Suppose that is convex and consider its image . To show is convex we must show that for any two of its members, and , the line segment connecting them
- This exercise is recommended for all readers.
- Problem 20
Let be a homomorphism.
- Show that the image under of a line in is a (possibly degenerate) line in .
- What happens to a -dimensional linear surface?
- Answer
- For , the line through with direction is the set . The image under of that line is the line through with direction . If is the zero vector then this line is degenerate.
A -dimensional linear surface in maps to a (possibly degenerate) -dimensional linear surface in . The proof is just like the one for the line.
- Problem 21
Prove that the restriction of a homomorphism to a subspace of its domain is another homomorphism.
- Answer
Suppose that is a homomorphism and suppose that is a subspace of . Consider the map defined by . (The only difference between and is the difference in domain.) Then this new map is linear: .
- Problem 22
Assume that is linear.
- Show that the rangespace of this map is a subspace of the codomain .
- Show that the nullspace of this map is a subspace of the domain .
- Show that if is a subspace of the domain then its image is a subspace of the codomain . This generalizes the first item.
- Generalize the second item.
- Answer
This will appear as a lemma in the next subsection.
- The range is nonempty because is nonempty. To finish we need to show that it is closed under combinations. A combination of range vectors has the form, where ,
- The nullspace is nonempty since it contains , as maps to . It is closed under linear combinations because, where are elements of the inverse image set , for
This image is nonempty because is nonempty. For closure under combinations, where ,
- The natural generalization is that the inverse image of a subspace of is a subspace.
Suppose that is a subspace of . Note that so the set is not empty. To show that this set is closed under combinations, let be elements of such that , ..., and note that
- Problem 23
Consider the set of isomorphisms from a vector space to itself. Is this a subspace of the space of homomorphisms from the space to itself?
- Answer
No; the set of isomorphisms does not contain the zero map (unless the space is trivial).
- Problem 24
Does Theorem 1.9 need that is a basis? That is, can we still get a well-defined and unique homomorphism if we drop either the condition that the set of 's be linearly independent, or the condition that it span the domain?
- Answer
If doesn't span the space then the map needn't be unique. For instance, if we try to define a map from to itself by specifying only that is sent to itself, then there is more than one homomorphism possible; both the identity map and the projection map onto the first component fit this condition.
If we drop the condition that is linearly independent then we risk an inconsistent specification (i.e., there could be no such map). An example is if we consider , and try to define a map from to itself that sends to itself, and sends both and to . No homomorphism can satisfy these three conditions.
- Problem 25
Let be a vector space and assume that the maps are linear.
- Define a map whose component functions are the given linear ones.
- Does the converse hold— is any linear map from to made up of two linear component maps to ?
- Generalize.
- Answer
- Briefly, the check of linearity is this.
- Yes. Let and be the projections
- In general, a map from a vector space to an is linear if and only if each of the component functions is linear. The verification is as in the prior item.
2 - Rangespace and Nullspace
Isomorphisms and homomorphisms both preserve structure. The difference is that homomorphisms needn't be onto and needn't be one-to-one. This means that homomorphisms are a more general kind of map, subject to fewer restrictions than isomorphisms. We will examine what can happen with homomorphisms that is prevented by the extra restrictions satisfied by isomorphisms.
We first consider the effect of dropping the onto requirement, of not requiring as part of the definition that a homomorphism be onto its codomain. For instance, the injection map
is not an isomorphism because it is not onto. Of course, being a function, a homomorphism is onto some set, namely its range; the map is onto the -plane subset of .
- Lemma 2.1
Under a homomorphism, the image of any subspace of the domain is a subspace of the codomain. In particular, the image of the entire space, the range of the homomorphism, is a subspace of the codomain.
- Proof
Let be linear and let be a subspace of the domain . The image is a subset of the codomain . It is nonempty because is nonempty and thus to show that is a subspace of we need only show that it is closed under linear combinations of two vectors. If and are members of then is also a member of because it is the image of from .
- Definition 2.2
The rangespace of a homomorphism is
sometimes denoted . The dimension of the rangespace is the map's rank.
(We shall soon see the connection between the rank of a map and the rank of a matrix.)
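In set notation, writing $\mathcal{R}(h)$ for the rangespace of a homomorphism $h\colon V\to W$ (one common notation),
\[
\mathcal{R}(h) = \{\, h(\vec{v}) \mid \vec{v} \in V \,\} \subseteq W,
\qquad
\operatorname{rank}(h) = \dim \mathcal{R}(h).
\]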
- Example 2.3
Recall that the derivative map given by is linear. The rangespace is the set of quadratic polynomials . Thus, the rank of this map is three.
- Example 2.4
With this homomorphism
an image vector in the range can have any constant term, must have an coefficient of zero, and must have the same coefficient of as of . That is, the rangespace is and so the rank is two.
The prior result shows that, in passing from the definition of isomorphism to the more general definition of homomorphism, omitting the "onto" requirement doesn't make an essential difference. Any homomorphism is onto its rangespace.
However, omitting the "one-to-one" condition does make a difference. A homomorphism may have many elements of the domain that map to one element of the codomain. Below is a "bean" sketch of a many-to-one map between sets.[1] It shows three elements of the codomain that are each the image of many members of the domain.

Recall that for any function , the set of elements of that are mapped to is the inverse image . Above, the three sets of many elements on the left are inverse images.
- Example 2.5
Consider the projection
which is a homomorphism that is many-to-one. In this instance, an inverse image set is a vertical line of vectors in the domain.

- Example 2.6
This homomorphism
is also many-to-one; for a fixed , the inverse image

is the set of plane vectors whose components add to .
The above examples have only to do with the fact that we are considering functions, specifically, many-to-one functions. They show the inverse images as sets of vectors that are related to the image vector . But these are more than just arbitrary functions: they are homomorphisms. What do the two preservation conditions say about the relationships?
In generalizing from isomorphisms to homomorphisms by dropping the one-to-one condition, we lose the property that we've stated intuitively as: the domain is "the same as" the range. That is, we lose that the domain corresponds perfectly to the range in a one-vector-by-one-vector way.
What we shall keep, as the examples below illustrate, is that a homomorphism describes a way in which the domain is "like", or "analogous to", the range.
- Example 2.7
We think of as being like , except that vectors have an extra component. That is, we think of the vector with components , , and as like the vector with components and . In defining the projection map , we make precise which members of the domain we are thinking of as related to which members of the codomain.
Understanding in what way the preservation conditions in the definition of homomorphism show that the domain elements are like the codomain elements is easiest if we draw as the -plane inside of . (Of course, is a set of two-tall vectors while the -plane is a set of three-tall vectors with a third component of zero, but there is an obvious correspondence.) Then, is the "shadow" of in the plane and the preservation of addition property says that
(Figure: a vector above one shadow, plus a vector above a second shadow, equals a vector above the sum of the shadows.)
Briefly, the shadow of a sum equals the sum of the shadows . (Preservation of scalar multiplication has a similar interpretation.)
Redrawing by separating the two spaces, moving the codomain to the right, gives an uglier picture but one that is more faithful to the "bean" sketch.

Again in this drawing, the vectors that map to lie in the domain in a vertical line (only one such vector is shown, in gray). Call any such member of this inverse image a " vector". Similarly, there is a vertical line of " vectors" and a vertical line of " vectors". Now, has the property that if and then . This says that the vector classes add, in the sense that any vector plus any vector equals a vector. (A similar statement holds about the classes under scalar multiplication.)
Thus, although the two spaces and are not isomorphic, describes a way in which they are alike: vectors in add as do the associated vectors in — vectors add as their shadows add.
- Example 2.8
A homomorphism can be used to express an analogy between spaces that is more subtle than the prior one. For the map
from Example 2.6 fix two numbers in the range . A that maps to has components that add to , that is, the inverse image is the set of vectors with endpoint on the diagonal line . Call these the " vectors". Similarly, we have the " vectors" and the " vectors". Then the addition preservation property says that
(Figure: a vector from the first class, plus a vector from the second class, equals a vector from the sum class.)
Restated, if a vector is added to a vector then the result is mapped by to a vector. Briefly, the image of a sum is the sum of the images. Even more briefly, . (The preservation of scalar multiplication condition has a similar restatement.)
- Example 2.9
The inverse images can be structures other than lines. For the linear map
the inverse image sets are planes , , etc., perpendicular to the -axis.

We won't describe how every homomorphism that we will use is an analogy because the formal sense that we make of "alike in that ..." is "a homomorphism exists such that ...". Nonetheless, the idea that a homomorphism between two spaces expresses how the domain's vectors fall into classes that act like the range's vectors is a good way to view homomorphisms.
Another reason that we won't treat all of the homomorphisms that we see as above is that many vector spaces are hard to draw (e.g., a space of polynomials). However, there is nothing bad about gaining insights from those spaces that we are able to draw, especially when those insights extend to all vector spaces. We derive two such insights from the three Examples 2.7, 2.8, and 2.9.
First, in all three examples, the inverse images are lines or planes, that is, linear surfaces. In particular, the inverse image of the range's zero vector is a line or plane through the origin— a subspace of the domain.
- Lemma 2.10
For any homomorphism, the inverse image of a subspace of the range is a subspace of the domain. In particular, the inverse image of the trivial subspace of the range is a subspace of the domain.
- Proof
Let be a homomorphism and let be a subspace of the rangespace . Consider , the inverse image of the set . It is nonempty because it contains , since , which is an element of , as is a subspace. To show that is closed under linear combinations, let and be elements, so that and are elements of , and then is also in the inverse image because is a member of the subspace .
- Definition 2.11
The nullspace or kernel of a linear map is the inverse image of
The dimension of the nullspace is the map's nullity.
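In set notation, writing $\mathcal{N}(h)$ for the nullspace of a homomorphism $h\colon V\to W$ (one common notation),
\[
\mathcal{N}(h) = \{\, \vec{v} \in V \mid h(\vec{v}) = \vec{0}_W \,\},
\qquad
\operatorname{nullity}(h) = \dim \mathcal{N}(h).
\]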

- Example 2.12
The map from Example 2.3 has this nullspace .
Now for the second insight from the above pictures. In Example 2.7, each of the vertical lines is squashed down to a single point— , in passing from the domain to the range, takes all of these one-dimensional vertical lines and "zeroes them out", leaving the range one dimension smaller than the domain. Similarly, in Example 2.8, the two-dimensional domain is mapped to a one-dimensional range by breaking the domain into lines (here, they are diagonal lines), and compressing each of those lines to a single member of the range. Finally, in Example 2.9, the domain breaks into planes which get "zeroed out", and so the map starts with a three-dimensional domain but ends with a one-dimensional range— this map "subtracts" two from the dimension. (Notice that, in this third example, the codomain is two-dimensional but the range of the map is only one-dimensional, and it is the dimension of the range that is of interest.)
- Theorem 2.14
A linear map's rank plus its nullity equals the dimension of its domain.
- Proof
Let be linear and let be a basis for the nullspace. Extend that to a basis for the entire domain. We shall show that is a basis for the rangespace. Then counting the size of these bases gives the result.
To see that is linearly independent, consider the equation . This gives that and so is in the nullspace of . As is a basis for this nullspace, there are scalars satisfying this relationship.
But is a basis for so each scalar equals zero. Therefore is linearly independent.
To show that spans the rangespace, consider and write as a linear combination of members of . This gives and since , ..., are in the nullspace, we have that . Thus, is a linear combination of members of , and so spans the space.
- Example 2.15
Where is
the rangespace and nullspace are
and so the rank of is two while the nullity is one.
- Example 2.16
If is the linear transformation then the range is , and so the rank of is one and the nullity is zero.
- Corollary 2.17
The rank of a linear map is less than or equal to the dimension of the domain. Equality holds if and only if the nullity of the map is zero.
We know that an isomorphism exists between two spaces if and only if their dimensions are equal. Here we see that for a homomorphism to exist, the dimension of the range must be less than or equal to the dimension of the domain. For instance, there is no homomorphism from onto . There are many homomorphisms from into , but none is onto all of three-space.
The rangespace of a linear map can be of dimension strictly less than the dimension of the domain (Example 2.3's derivative transformation on has a domain of dimension four but a range of dimension three). Thus, under a homomorphism, linearly independent sets in the domain may map to linearly dependent sets in the range (for instance, the derivative sends to ). That is, under a homomorphism, independence may be lost. In contrast, dependence stays.
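The derivative example lends itself to a quick numerical check. The sketch below is hedged: it assumes the power basis 1, x, x², x³ for the cubic polynomials, which makes the derivative a four-by-four matrix of rank three, and it shows independence being lost.

```python
import numpy as np

# The derivative on cubic polynomials, with respect to the assumed power basis
# 1, x, x^2, x^3: it sends a0 + a1*x + a2*x^2 + a3*x^3 to a1 + 2*a2*x + 3*a3*x^2.
D = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3],
              [0, 0, 0, 0]])

rank = np.linalg.matrix_rank(D)
nullity = D.shape[1] - rank
print(rank, nullity)   # 3 1 -- four-dimensional domain, three-dimensional range

# Independence can be lost: the polynomials 1 and x are independent, but their
# images (the first two columns below) are the dependent polynomials 0 and 1.
print(D @ np.array([1, 0, 0, 0]))   # [0 0 0 0]
print(D @ np.array([0, 1, 0, 0]))   # [1 0 0 0]
```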
- Lemma 2.18
Under a linear map, the image of a linearly dependent set is linearly dependent.
- Proof
Suppose that , with some nonzero. Then, because and because , we have that with some nonzero .
When is independence not lost? One obvious sufficient condition is when the homomorphism is an isomorphism. This condition is also necessary; see Problem 14. We will finish this subsection comparing homomorphisms with isomorphisms by observing that a one-to-one homomorphism is an isomorphism from its domain onto its range.
- Definition 2.19
A linear map that is one-to-one is nonsingular.
(In the next section we will see the connection between this use of "nonsingular" for maps and its familiar use for matrices.)
- Example 2.20
This nonsingular homomorphism
gives the obvious correspondence between and the -plane inside of .
The prior observation allows us to adapt some results about isomorphisms to this setting.
- Theorem 2.21
In an -dimensional vector space , these:
- is nonsingular, that is, one-to-one
- has a linear inverse
- , that is,
- if is a basis for then is a basis for
are equivalent statements about a linear map .
- Proof
We will first show that . We will then show that .
For , suppose that the linear map is one-to-one, and so has an inverse. The domain of that inverse is the range of and so a linear combination of two members of that domain has the form . On that combination, the inverse gives this.
Thus the inverse of a one-to-one linear map is automatically linear. But this also gives the implication, because the inverse itself must be one-to-one.
Of the remaining implications, holds because any homomorphism maps to , but a one-to-one map sends at most one member of to .
Next, is true since rank plus nullity equals the dimension of the domain.
For , to show that is a basis for the rangespace we need only show that it is a spanning set, because by assumption the range has dimension . Consider . Expressing as a linear combination of basis elements produces , which gives that , as desired.
Finally, for the implication, assume that is a basis for so that is a basis for . Then every member of the space has the unique representation . Define a map from to by
(uniqueness of the representation makes this well-defined). Checking that it is linear and that it is the inverse of are easy.
We've now seen that a linear map shows how the structure of the domain is like that of the range. Such a map can be thought to organize the domain space into inverse images of points in the range. In the special case that the map is one-to-one, each inverse image is a single point and the map is an isomorphism between the domain and the range.
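As a concrete check of Theorem 2.21, here is a minimal NumPy sketch; the particular transformation of the plane and the use of the standard basis are assumptions chosen for illustration.

```python
import numpy as np

# An assumed transformation of R^2, represented with respect to the standard basis.
T = np.array([[1.0, 2.0],
              [3.0, 4.0]])
n = T.shape[0]

# Rank equals the dimension of the space ...
print(np.linalg.matrix_rank(T) == n)                 # True

# ... so a linear inverse exists and undoes T ...
T_inv = np.linalg.inv(T)
print(np.allclose(T_inv @ T, np.eye(n)))             # True

# ... and the image of a basis is a basis: the images of the standard basis
# vectors are the columns of T, and they are linearly independent.
images = [T @ e for e in np.eye(n)]
print(np.linalg.matrix_rank(np.column_stack(images)) == n)   # True
```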
Exercises
- This exercise is recommended for all readers.
- Problem 1
Let be given by . Which of these are in the nullspace? Which are in the rangespace?
- This exercise is recommended for all readers.
- Problem 2
Find the nullspace, nullity, rangespace, and rank of each map.
- given by
- given by
- given by
- the zero map
- This exercise is recommended for all readers.
- Problem 3
Find the nullity of each map.
- of rank five
- of rank one
- , an onto map
- , onto
- This exercise is recommended for all readers.
- Problem 4
What is the nullspace of the differentiation transformation ? What is the nullspace of the second derivative, as a transformation of ? The -th derivative?
- Problem 5
Example 2.7 restates the first condition in the definition of homomorphism as "the shadow of a sum is the sum of the shadows". Restate the second condition in the same style.
- Problem 6
For the homomorphism given by find these.
- This exercise is recommended for all readers.
- Problem 7
For the map given by
sketch these inverse image sets: , , and .
- This exercise is recommended for all readers.
- Problem 8
Each of these transformations of is nonsingular. Find the inverse function of each.
- Problem 9
Describe the nullspace and rangespace of a transformation given by .
- Problem 10
List all pairs that are possible for linear maps from to .
- Problem 11
Does the differentiation map have an inverse?
- This exercise is recommended for all readers.
- Problem 12
Find the nullity of the map given by
- Problem 13
- Prove that a homomorphism is onto if and only if its rank equals the dimension of its codomain.
- Conclude that a homomorphism between vector spaces with the same dimension is one-to-one if and only if it is onto.
- Problem 14
Show that a linear map is nonsingular if and only if it preserves linear independence.
- Problem 15
Corollary 2.17 says that for there to be an onto homomorphism from a vector space to a vector space , it is necessary that the dimension of be less than or equal to the dimension of . Prove that this condition is also sufficient; use Theorem 1.9 to show that if the dimension of is less than or equal to the dimension of , then there is a homomorphism from to that is onto.
- Problem 16
Let be a homomorphism, but not the zero homomorphism. Prove that if is a basis for the nullspace and if is not in the nullspace then is a basis for the entire domain .
- This exercise is recommended for all readers.
- Problem 17
Recall that the nullspace is a subset of the domain and the rangespace is a subset of the codomain. Are they necessarily distinct? Is there a homomorphism that has a nontrivial intersection of its nullspace and its rangespace?
- Problem 18
Prove that the image of a span equals the span of the images. That is, where is linear, prove that if is a subset of then equals . This generalizes Lemma 2.1 since it shows that if is any subspace of then its image is a subspace of , because the span of the set is .
- This exercise is recommended for all readers.
- Problem 19
- Prove that for any linear map and any , the set has the form
- Consider the map given by
- Conclude from the prior two items that for any linear system of the form
- Show that this map is linear
- Show that the -th derivative map is a linear transformation of for each . Prove that this map is a linear transformation of that space
- Problem 20
Prove that for any transformation that is rank one, the map given by composing the operator with itself satisfies for some real number .
- Problem 21
Show that for any space of dimension , the dual space
is isomorphic to . It is often denoted . Conclude that .
- Problem 22
Show that any linear map is the sum of maps of rank one.
- Problem 23
Is "is homomorphic to" an equivalence relation? (Hint: the difficulty is to decide on an appropriate meaning for the quoted phrase.)
- Problem 24
Show that the rangespaces and nullspaces of powers of linear maps form descending
and ascending
chains. Also show that if is such that then all following rangespaces are equal: . Similarly, if then .
Footnotes
- ↑ More information on many-to-one maps is in the appendix.
Section III - Computing Linear Maps
The prior section shows that a linear map is determined by its action on a basis. In fact, the equation
shows that, if we know the value of the map on the vectors in a basis, then we can compute the value of the map on any vector at all. We just need to find the 's to express with respect to the basis.
This section gives the scheme that computes, from the representation of a vector in the domain , the representation of that vector's image in the codomain , using the representations of , ..., .
1 - Representing Linear Maps with Matrices
- Example 1.1
Consider a map with domain and codomain (fixing
as the bases for these spaces) that is determined by this action on the vectors in the domain's basis.
To compute the action of this map on any vector at all from the domain, we first express and with respect to the codomain's basis:
and
(these are easy to check). Then, as described in the preamble, for any member of the domain, we can express the image in terms of the 's.
Thus,
with then .
For instance,
with then .
We will express computations like the one above with a matrix notation.
In the middle is the argument to the map, represented with respect to the domain's basis by a column vector with components and . On the right is the value of the map on that argument, represented with respect to the codomain's basis by a column vector with components , etc. The matrix on the left is the new thing. It consists of the coefficients from the vector on the right, and from the first row, and from the second row, and and from the third row.
This notation simply breaks the parts from the right, the coefficients and the 's, out separately on the left, into a vector that represents the map's argument and a matrix that we will take to represent the map itself.
- Definition 1.2
Suppose that and are vector spaces of dimensions and with bases and , and that is a linear map. If
then
is the matrix representation of with respect to .
Briefly, the vectors representing the 's are adjoined to make the matrix representing the map.
Observe that the number of columns of the matrix is the dimension of the domain of the map, and the number of rows is the dimension of the codomain.
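To make the bookkeeping in this definition concrete, here is a minimal NumPy sketch; the map h(x, y) = (x + y, x, 2y) and the standard bases are assumptions chosen only for illustration, not the map or bases of Example 1.1.

```python
import numpy as np

# An assumed map h: R^2 -> R^3 for illustration, with the standard bases.
def h(v):
    x, y = v
    return np.array([x + y, x, 2 * y])

domain_basis = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

# With the standard codomain basis a vector is its own representation, so the
# columns of the representing matrix are simply the images of the basis vectors.
H = np.column_stack([h(beta) for beta in domain_basis])
print(H)          # 3 rows (codomain dimension) by 2 columns (domain dimension)
print(H.shape)    # (3, 2)
```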
- Example 1.3
If is given by
then where
the action of on is given by
and a simple calculation gives
showing that this is the matrix representing with respect to the bases.
We will use lower case letters for a map, upper case for the matrix, and lower case again for the entries of the matrix. Thus for the map , the matrix representing it is , with entries .
- Theorem 1.4
Assume that and are vector spaces of dimensions and with bases and , and that is a linear map. If is represented by
and is represented by
then the representation of the image of is this.
- Proof
We will think of the matrix and the vector as combining to make the vector .
- Definition 1.5
The matrix-vector product of a matrix and a vector is this.
The point of Definition 1.2 is to generalize Example 1.1, that is, the point of the definition is Theorem 1.4, that the matrix describes how to get from the representation of a domain vector with respect to the domain's basis to the representation of its image in the codomain with respect to the codomain's basis. With Definition 1.5, we can restate this as: application of a linear map is represented by the matrix-vector product of the map's representative and the vector's representative.
- Example 1.6
With the matrix from Example 1.3 we can calculate where that map sends this vector.
This vector is represented, with respect to the domain basis , by
and so this is the representation of the value with respect to the codomain basis .
To find itself, not its representation, take .
- Example 1.7
Let be projection onto the -plane. To give a matrix representing this map, we first fix bases.
For each vector in the domain's basis, we find its image under the map.
Then we find the representation of each image with respect to the codomain's basis
(these are easily checked). Finally, adjoining these representations gives the matrix representing with respect to .
We can illustrate Theorem 1.4 by computing the matrix-vector product representing the following statement about the projection map.
Representing this vector from the domain with respect to the domain's basis
gives this matrix-vector product.
Expanding this representation into a linear combination of vectors from
checks that the map's action is indeed reflected in the operation of the matrix. (We will sometimes compress these three displayed equations into one
in the course of a calculation.)
We now have two ways to compute the effect of projection, the straightforward formula that drops each three-tall vector's third component to make a two-tall vector, and the above formula that uses representations and matrix-vector multiplication. Compared to the first way, the second way might seem complicated. However, it has advantages. The next example shows that giving a formula for some maps is simplified by this new scheme.
- Example 1.8
To represent a rotation map that turns all vectors in the plane counterclockwise through an angle
we start by fixing bases. Using both as a domain basis and as a codomain basis is natural. Now we find the image under the map of each vector in the domain's basis.
Then we represent these images with respect to the codomain's basis. Because this basis is , vectors are represented by themselves. Finally, adjoining the representations gives the matrix representing the map.
The advantage of this scheme is that just by knowing how to represent the image of the two basis vectors, we get a formula that tells us the image of any vector at all; here a vector rotated by .
(Again, we are using the fact that, with respect to , vectors represent themselves.)
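The rotation scheme can be tried numerically. In this sketch the angle of thirty degrees and the sample vector are arbitrary choices; the cosine and sine entries are the representations of the rotated basis vectors described above.

```python
import numpy as np

# The columns of the representing matrix are the rotated images of the
# standard basis vectors e1 and e2.
def rotation_matrix(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

R = rotation_matrix(np.pi / 6)   # rotate counterclockwise by 30 degrees
v = np.array([1.0, 0.0])
print(R @ v)                     # approximately (0.866, 0.5)
```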
We have already seen the addition and scalar multiplication operations of matrices and the dot product operation of vectors. Matrix-vector multiplication is a new operation in the arithmetic of vectors and matrices. Nothing in Definition 1.5 requires us to view it in terms of representations. We can get some insight into this operation by turning away from what is being represented, and instead focusing on how the entries combine.
- Example 1.9
In the definition the width of the matrix equals the height of the vector. Hence, the first product below is defined while the second is not.
One reason that this product is not defined is purely formal: the definition requires that the sizes match, and these sizes don't match. Behind the formality, though, is a reason why we will leave it undefined— the matrix represents a map with a three-dimensional domain while the vector represents a member of a two-dimensional space.
A good way to view a matrix-vector product is as the dot products of the rows of the matrix with the column vector.
Looked at in this row-by-row way, this new operation generalizes dot product.
Matrix-vector product can also be viewed column-by-column.
- Example 1.10
The result has the columns of the matrix weighted by the entries of the vector. This way of looking at it brings us back to the objective stated at the start of this section, to compute as .
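Here is a small NumPy sketch of the two views, with a matrix and vector chosen only for illustration; both views agree with the built-in product.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
x = np.array([7.0, 8.0, 9.0])

# Row-by-row: each component of the result is a dot product of a row with x.
row_view = np.array([A[i] @ x for i in range(A.shape[0])])

# Column-by-column: the result is the columns of A weighted by the entries of x.
column_view = sum(x[j] * A[:, j] for j in range(A.shape[1]))

print(A @ x)                            # [ 50. 122.]
print(np.allclose(row_view, A @ x))     # True
print(np.allclose(column_view, A @ x))  # True
```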
We began this section by noting that the equality of these two enables us to compute the action of on any argument knowing only , ..., . We have developed this into a scheme to compute the action of the map by taking the matrix-vector product of the matrix representing the map and the vector representing the argument. In this way, any linear map is represented with respect to some bases by a matrix. In the next subsection, we will show the converse, that any matrix represents a linear map.
Exercises
- This exercise is recommended for all readers.
- Problem 1
Multiply the matrix
by each vector (or state "not defined").
- Problem 2
Perform, if possible, each matrix-vector multiplication.
- This exercise is recommended for all readers.
- Problem 3
Solve this matrix equation.
- This exercise is recommended for all readers.
- Problem 4
For a homomorphism from to that sends
where does go?
- This exercise is recommended for all readers.
- Problem 5
Assume that is determined by this action.
Using the standard bases, find
- the matrix representing this map;
- a general formula for .
- This exercise is recommended for all readers.
- Problem 6
Let be the derivative transformation.
- Represent with respect to where .
- Represent with respect to where .
- This exercise is recommended for all readers.
- Problem 7
Represent each linear map with respect to each pair of bases.
- with respect to where , given by
- with respect to where , given by
- with respect to where and , given by
- with respect to where and , given by
- with respect to where , given by
- Problem 8
Represent the identity map on any nontrivial space with respect to , where is any basis.
- Problem 9
Represent, with respect to the natural basis, the transpose transformation on the space of matrices.
- Problem 10
Assume that is a basis for a vector space. Represent with respect to the transformation that is determined by each.
- , , ,
- , , ,
- , , ,
- Problem 11
Example 1.8 shows how to represent the rotation transformation of the plane with respect to the standard basis. Express these other transformations also with respect to the standard basis.
- the dilation map , which multiplies all vectors by the same scalar
- the reflection map , which reflects all vectors across a line through the origin
- This exercise is recommended for all readers.
- Problem 12
Consider a linear transformation of determined by these two.
- Represent this transformation with respect to the standard bases.
- Where does the transformation send this vector?
- Represent this transformation with respect to these bases.
- Using from the prior item, represent the transformation with respect to .
- Problem 13
Suppose that is nonsingular so that by Theorem II.2.21, for any basis the image is a basis for .
- Represent the map with respect to .
- For a member of the domain, where the representation of has components , ..., , represent the image vector with respect to the image basis .
- Problem 14
Give a formula for the product of a matrix and , the column vector that is all zeroes except for a single one in the -th position.
- This exercise is recommended for all readers.
- Problem 15
For each vector space of functions of one real variable, represent the derivative transformation with respect to .
- ,
- ,
- ,
- Problem 16
Find the range of the linear transformation of represented with respect to the standard bases by each matrix.
- a matrix of the form
- This exercise is recommended for all readers.
- Problem 17
Can one matrix represent two different linear maps? That is, can ?
- Problem 18
Prove Theorem 1.4.
- This exercise is recommended for all readers.
- Problem 19
Example 1.8 shows how to represent rotation of all vectors in the plane through an angle about the origin, with respect to the standard bases.
- Rotation of all vectors in three-space through an angle about the -axis is a transformation of . Represent it with respect to the standard bases. Arrange the rotation so that to someone whose feet are at the origin and whose head is at , the movement appears clockwise.
- Repeat the prior item, only rotate about the -axis instead. (Put the person's head at .)
- Repeat, about the -axis.
- Extend the prior item to . (Hint: "rotate about the -axis" can be restated as "rotate parallel to the -plane".)
- Problem 20 (Schur's Triangularization Lemma)
- Let be a subspace of and fix bases . What is the relationship between the representation of a vector from with respect to and the representation of that vector (viewed as a member of ) with respect to ?
- What about maps?
- Fix a basis for and observe that the spans
- Conclude that for every linear map there are bases so the matrix representing with respect to is upper-triangular (that is, each entry with is zero).
- Is an upper-triangular representation unique?
2 - Any Matrix Represents a Linear Map
The prior subsection shows that the action of a linear map is described by a matrix , with respect to appropriate bases, in this way.
In this subsection, we will show the converse, that each matrix represents a linear map.
Recall that, in the definition of the matrix representation of a linear map, the number of columns of the matrix is the dimension of the map's domain and the number of rows of the matrix is the dimension of the map's codomain. Thus, for instance, a matrix cannot represent a map from to . The next result says that, beyond this restriction on the dimensions, there are no other limitations: the matrix represents a map from any three-dimensional space to any two-dimensional space.
- Theorem 2.1
Any matrix represents a homomorphism between vector spaces of appropriate dimensions, with respect to any pair of bases.
- Proof
For the matrix
fix any -dimensional domain space and any -dimensional codomain space . Also fix bases and for those spaces. Define a function by: where in the domain is represented as
then its image is the member of the codomain represented by
that is, is defined to be . (This is well-defined by the uniqueness of the representation .)
Observe that has simply been defined to make it the map that is represented with respect to by the matrix . So to finish, we need only check that is linear. If are such that
and then the calculation
provides this verification.
- Example 2.2
Which map the matrix represents depends on which bases are used. If
then represented by with respect to maps
while represented by with respect to is this map.
These two are different. The first is projection onto the axis, while the second is projection onto the axis.
So not only is any linear map described by a matrix but any matrix describes a linear map. This means that we can, when convenient, handle linear maps entirely as matrices, simply doing the computations, without having to worry that a matrix of interest does not represent a linear map on some pair of spaces of interest. (In practice, when we are working with a matrix but no spaces or bases have been specified, we will often take the domain and codomain to be and and use the standard bases. In this case, because the representation is transparent— the representation with respect to the standard basis of is — the column space of the matrix equals the range of the map. Consequently, the column space of is often denoted by .)
With the theorem, we have characterized linear maps as those maps that act in this matrix way. Each linear map is described by a matrix and each matrix describes a linear map. We finish this section by illustrating how a matrix can be used to tell things about its maps.
- Theorem 2.3
The rank of a matrix equals the rank of any map that it represents.
- Proof
Suppose that the matrix is . Fix domain and codomain spaces and of dimension and , with bases and . Then represents some linear map between those spaces with respect to these bases whose rangespace
is the span . The rank of is the dimension of this rangespace.
The rank of the matrix is its column rank (or its row rank; the two are equal). This is the dimension of the column space of the matrix, which is the span of the set of column vectors .
To see that the two spans have the same dimension, recall that a representation with respect to a basis gives an isomorphism . Under this isomorphism, there is a linear relationship among members of the rangespace if and only if the same relationship holds in the column space, e.g., if and only if . Hence, a subset of the rangespace is linearly independent if and only if the corresponding subset of the column space is linearly independent. This means that the size of the largest linearly independent subset of the rangespace equals the size of the largest linearly independent subset of the column space, and so the two spaces have the same dimension.
- Example 2.4
Any map represented by
must, by definition, be from a three-dimensional domain to a four-dimensional codomain. In addition, because the rank of this matrix is two (we can spot this by eye or get it with Gauss' method), any map represented by this matrix has a two-dimensional rangespace.
- Corollary 2.5
Let be a linear map represented by a matrix . Then is onto if and only if the rank of equals the number of its rows, and is one-to-one if and only if the rank of equals the number of its columns.
- Proof
For the first half, the dimension of the rangespace of is the rank of , which equals the rank of by the theorem. Since the dimension of the codomain of is the number of rows in , if the rank of equals the number of rows, then the dimension of the rangespace equals the dimension of the codomain. But a subspace with the same dimension as its superspace must equal that superspace (a basis for the rangespace is a linearly independent subset of the codomain, whose size is equal to the dimension of the codomain, and so this set is a basis for the codomain).
For the second half, a linear map is one-to-one if and only if it is an isomorphism between its domain and its range, that is, if and only if its domain has the same dimension as its range. But the number of columns in is the dimension of 's domain, and by the theorem the rank of equals the dimension of 's range.
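Corollary 2.5 is easy to check numerically. In this sketch the three-by-two matrix is an arbitrary choice; its rank equals its number of columns but not its number of rows, so any map it represents is one-to-one but not onto.

```python
import numpy as np

# An assumed 3x2 matrix for illustration.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
rank = np.linalg.matrix_rank(A)

print(rank == A.shape[1])   # True: one-to-one (the nullity is zero)
print(rank == A.shape[0])   # False: not onto (the rangespace is a plane in R^3)
```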
The above results end any confusion caused by our use of the word "rank" to mean apparently different things when applied to matrices and when applied to maps. We can also justify the dual use of "nonsingular". We've defined a matrix to be nonsingular if it is square and is the matrix of coefficients of a linear system with a unique solution, and we've defined a linear map to be nonsingular if it is one-to-one.
- Corollary 2.6
A square matrix represents nonsingular maps if and only if it is a nonsingular matrix. Thus, a matrix represents an isomorphism if and only if it is square and nonsingular.
- Proof
Immediate from the prior result.
- Example 2.7
Any map from to represented with respect to any pair of bases by
is nonsingular because this matrix has rank two.
- Example 2.8
Any map represented by
is not nonsingular because this matrix is not nonsingular.
We've now seen that the relationship between maps and matrices goes both ways: fixing bases, any linear map is represented by a matrix and any matrix describes a linear map. That is, by fixing spaces and bases we get a correspondence between maps and matrices. In the rest of this chapter we will explore this correspondence. For instance, we've defined for linear maps the operations of addition and scalar multiplication and we shall see what the corresponding matrix operations are. We shall also see the matrix operation that represents the map operation of composition. And, we shall see how to find the matrix that represents a map's inverse.
Exercises
- This exercise is recommended for all readers.
- Problem 1
Decide if the vector is in the column space of the matrix.
- ,
- ,
- ,
- This exercise is recommended for all readers.
- Problem 2
Decide if each vector lies in the range of the map from to represented with respect to the standard bases by the matrix.
- ,
- ,
- This exercise is recommended for all readers.
- Problem 3
Consider this matrix, representing a transformation of , and these bases for that space.
- To what vector in the codomain is the first member of mapped?
- The second member?
- Where is a general vector from the domain (a vector with components and ) mapped? That is, what transformation of is represented with respect to by this matrix?
- Problem 4
What transformation of is represented with respect to and by this matrix?
- This exercise is recommended for all readers.
- Problem 5
Decide if is in the range of the map from to represented with respect to and by this matrix.
- Problem 6
Example 2.8 gives a matrix that is nonsingular, and is therefore associated with maps that are nonsingular.
- Find the set of column vectors representing the members of the nullspace of any map represented by this matrix.
- Find the nullity of any such map.
- Find the set of column vectors representing the members of the rangespace of any map represented by this matrix.
- Find the rank of any such map.
- Check that rank plus nullity equals the dimension of the domain.
- This exercise is recommended for all readers.
- Problem 7
Because the rank of a matrix equals the rank of any map it represents, if one matrix represents two different maps (where ) then the dimension of the rangespace of equals the dimension of the rangespace of . Must these equal-dimensioned rangespaces actually be the same?
- This exercise is recommended for all readers.
- Problem 8
Let be an -dimensional space with bases and . Consider a map that sends, for , the column vector representing with respect to to the column vector representing with respect to . Show that is a linear transformation of .
- Problem 9
Example 2.2 shows that changing the pair of bases can change the map that a matrix represents, even though the domain and codomain remain the same. Could the map ever not change? Is there a matrix , vector spaces and , and associated pairs of bases and (with or or both) such that the map represented by with respect to equals the map represented by with respect to ?
- This exercise is recommended for all readers.
- Problem 10
A square matrix is a diagonal matrix if it is all zeroes except possibly for the entries on its upper-left to lower-right diagonal— its entry, its entry, etc. Show that a linear map is an isomorphism if there are bases such that, with respect to those bases, the map is represented by a diagonal matrix with no zeroes on the diagonal.
- Problem 11
Describe geometrically the action on of the map represented with respect to the standard bases by this matrix.
Do the same for these.
- Problem 12
The fact that for any linear map the rank plus the nullity equals the dimension of the domain shows that a necessary condition for the existence of a homomorphism between two spaces, onto the second space, is that there be no gain in dimension. That is, where is onto, the dimension of must be less than or equal to the dimension of .
- Show that this (strong) converse holds: no gain in dimension implies that there is a homomorphism and, further, any matrix with the correct size and correct rank represents such a map.
- Are there bases for such that
this matrix
- Problem 13
Let be an -dimensional space and suppose that . Fix a basis for and consider the map given by the dot product.
- Show that this map is linear.
- Show that for any linear map there is an such that .
- In the prior item we fixed the basis and varied the to get all possible linear maps. Can we get all possible linear maps by fixing an and varying the basis?
- Problem 14
Let be vector spaces with bases .
- Suppose that is represented with respect to by the matrix . Give the matrix representing the scalar multiple (where ) with respect to by expressing it in terms of .
- Suppose that are represented with respect to by and . Give the matrix representing with respect to by expressing it in terms of and .
- Suppose that is represented with respect to by and is represented with respect to by . Give the matrix representing with respect to by expressing it in terms of and .
Section IV - Matrix Operations
The prior section shows how matrices represent linear maps. A good strategy, on seeing a new idea, is to explore how it interacts with some already-established ideas. In the first subsection we will ask how the representation of the sum of two maps is related to the representations of the two maps, and how the representation of a scalar product of a map is related to the representation of that map. In later subsections we will see how to represent map composition and map inverse.
1 - Sums and Scalar Products
Recall that for two maps and with the same domain and codomain, the map sum has this definition.
The easiest way to see how the representations of the maps combine to represent the map sum is with an example.
- Example 1.1
Suppose that are represented with respect to the bases and by these matrices.
Then, for any represented with respect to , computation of the representation of
gives this representation of .
Thus, the action of is described by this matrix-vector product.
This matrix is the entry-by-entry sum of the original matrices, e.g., the entry of is the sum of the entry of and the entry of .
Representing a scalar multiple of a map works the same way.
- Example 1.2
If is a transformation represented by
then the scalar multiple map acts in this way.
Therefore, this is the matrix representing .
- Definition 1.3
The sum of two same-sized matrices is their entry-by-entry sum. The scalar multiple of a matrix is the result of entry-by-entry scalar multiplication.
- Remark 1.4
These extend the vector addition and scalar multiplication operations that we defined in the first chapter.
- Theorem 1.5
Let be linear maps represented with respect to bases by the matrices and , and let be a scalar. Then the map is represented with respect to by , and the map is represented with respect to by .
- Proof
Problem 2; generalize the examples above.
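The theorem can be spot-checked numerically. In this hedged sketch the two maps are transformations of the plane represented with respect to the standard bases by assumed matrices.

```python
import numpy as np

# Two assumed maps from R^2 to R^2, represented with respect to the standard bases.
F = np.array([[1.0, 2.0],
              [0.0, 1.0]])
G = np.array([[3.0, 0.0],
              [1.0, 1.0]])
v = np.array([5.0, -2.0])
r = 4.0

# Applying the maps and then adding gives the same vector as applying the map
# represented by the entry-by-entry sum F + G; similarly for scalar multiples.
print(np.allclose(F @ v + G @ v, (F + G) @ v))   # True
print(np.allclose(r * (F @ v), (r * F) @ v))     # True
```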
A notable special case of scalar multiplication is multiplication by zero. For any map is the zero homomorphism and for any matrix is the zero matrix.
- Example 1.6
The zero map from any three-dimensional space to any two-dimensional space is represented by the zero matrix
no matter which domain and codomain bases are used.
Exercises
- This exercise is recommended for all readers.
- Problem 1
Perform the indicated operations, if defined.
- Problem 2
Prove Theorem 1.5.
- Prove that matrix addition represents addition of linear maps.
- Prove that matrix scalar multiplication represents scalar multiplication of linear maps.
- This exercise is recommended for all readers.
- Problem 3
Prove each, where the operations are defined, where , , and are matrices, where is the zero matrix, and where and are scalars.
- Matrix addition is commutative .
- Matrix addition is associative .
- The zero matrix is an additive identity .
- Matrices have an additive inverse .
- Problem 4
Fix domain and codomain spaces. In general, one matrix can represent many different maps with respect to different bases. However, prove that a zero matrix represents only a zero map. Are there other such matrices?
- This exercise is recommended for all readers.
- Problem 5
Let and be vector spaces of dimensions and . Show that the space of linear maps from to is isomorphic to .
- This exercise is recommended for all readers.
- Problem 6
Show that it follows from the prior questions that for any six transformations there are scalars such that is the zero map. (Hint: this is a bit of a misleading question.)
- Problem 7
The trace of a square matrix is the sum of the entries on the main diagonal (the entry plus the entry, etc.; we will see the significance of the trace in Chapter Five). Show that . Is there a similar result for scalar multiplication?
- Problem 8
Recall that the transpose of a matrix is another matrix, whose entry is the entry of . Verify these identities.
- This exercise is recommended for all readers.
- Problem 9
A square matrix is symmetric if each entry equals the entry, that is, if the matrix equals its transpose.
- Prove that for any , the matrix is symmetric. Does every symmetric matrix have this form?
- Prove that the set of symmetric matrices is a subspace of .
- This exercise is recommended for all readers.
- Problem 10
- How does matrix rank interact with scalar multiplication— can a scalar product of a rank matrix have rank less than ? Greater?
- How does matrix rank interact with matrix addition— can a sum of rank matrices have rank less than ? Greater?
2 - Matrix Multiplication
After representing addition and scalar multiplication of linear maps in the prior subsection, the natural next map operation to consider is composition.
- Lemma 2.1
A composition of linear maps is linear.
- Proof
(This argument has appeared earlier, as part of the proof that isomorphism is an equivalence relation between spaces.) Let and be linear. The calculation
shows that preserves linear combinations.
To see how the representation of the composite arises out of the representations of the two compositors, consider an example.
- Example 2.2
Let and , fix bases , , , and let these be the representations.
To represent the composition we fix a , represent of , and then represent of that. The representation of is the product of 's matrix and 's vector.
The representation of is the product of 's matrix and 's vector.
Distributing and regrouping on the 's gives
which we recognize as the result of this matrix-vector product.
Thus, the matrix representing has the rows of combined with the columns of .
- Definition 2.3
The matrix-multiplicative product of the matrix and the matrix is the matrix , where
that is, the -th entry of the product is the dot product of the -th row and the -th column.
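Here is a small NumPy sketch of the definition, with matrices chosen only for illustration: each entry of the product is computed as a row-column dot product and then compared against the built-in product.

```python
import numpy as np

G = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])          # 3x2
H = np.array([[1.0, 0.0, -1.0],
              [2.0, 1.0,  0.0]])    # 2x3

# The i,j entry of GH is the dot product of row i of G with column j of H.
product = np.array([[G[i] @ H[:, j] for j in range(H.shape[1])]
                    for i in range(G.shape[0])])

print(product)
print(np.allclose(product, G @ H))  # True: agrees with the built-in product
```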
- Example 2.5
- Theorem 2.6
A composition of linear maps is represented by the matrix product of the representatives.
- Proof
(This argument parallels Example 2.2.) Let and be represented by and with respect to bases , , and , of sizes , , and . For any , the -th component of is
and so the -th component of is this.
Distribute and regroup on the 's.
Finish by recognizing that the coefficient of each
matches the definition of the entry of the product .
The theorem is an example of a result that supports a definition. We can picture what the definition and theorem together say with this arrow diagram ("wrt" abbreviates "with respect to").
Above the arrows, the maps show that the two ways of going from to , straight over via the composition or else by way of , have the same effect
(this is just the definition of composition). Below the arrows, the matrices indicate that the product does the same thing— multiplying into the column vector has the same effect as multiplying the column first by and then multiplying the result by .
The definition of the matrix-matrix product operation does not restrict us to view it as a representation of a linear map composition. We can get insight into this operation by studying it as a mechanical procedure. The striking thing is the way that rows and columns combine.
One aspect of that combination is that the sizes of the matrices involved are significant. Briefly, .
- Example 2.7
This product is not defined
because the number of columns on the left does not equal the number of rows on the right.
In terms of the underlying maps, the fact that the sizes must match up reflects the fact that matrix multiplication is defined only when a corresponding function composition
is possible.
- Remark 2.8
The order in which these things are written can be confusing. In the "" equation, the number written first is the dimension of 's codomain and is thus the number that appears last in the map dimension description above. The explanation is that while is done first and then is applied, that composition is written , from the notation "". (Some people try to lessen confusion by reading "" aloud as " following ".) That order then carries over to matrices: is represented by .
Another aspect of the way that rows and columns combine in the matrix product operation is that in the definition of the entry
the subscripts on the 's are column indicators while those on the 's indicate rows. That is, summation takes place over the columns of but over the rows of ; left is treated differently than right, so may be unequal to . Matrix multiplication is not commutative.
- Example 2.9
Matrix multiplication hardly ever commutes. Test that by multiplying randomly chosen matrices both ways.
- Example 2.10
Commutativity can fail more dramatically:
while
isn't even defined.
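A quick numerical sketch makes both failures concrete; the matrices are arbitrary choices for illustration.

```python
import numpy as np

# Two assumed square matrices: the products in the two orders differ.
G = np.array([[1.0, 2.0],
              [3.0, 4.0]])
H = np.array([[0.0, 1.0],
              [0.0, 0.0]])
print(G @ H)                       # [[0. 1.] [0. 3.]]
print(H @ G)                       # [[3. 4.] [0. 0.]]
print(np.allclose(G @ H, H @ G))   # False

# Commutativity can fail more dramatically: with mismatched sizes only one
# order of the product is even defined.
A = np.ones((2, 3))
B = np.ones((3, 1))
print((A @ B).shape)               # (2, 1)
try:
    B @ A                          # inner dimensions 1 and 2 do not match
except ValueError:
    print("B @ A is not defined")
```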
- Remark 2.11
The fact that matrix multiplication is not commutative may be puzzling at first sight, perhaps just because most algebraic operations in elementary mathematics are commutative. But on further reflection, it isn't so surprising. After all, matrix multiplication represents function composition, which is not commutative— if and then while . True, this is not linear and we might have hoped that linear functions commute, but this perspective shows that the failure of commutativity for matrix multiplication fits into a larger context.
Except for the lack of commutativity, matrix multiplication is algebraically well-behaved. Below are some nice properties and more are in Problem 10 and Problem 11.
- Theorem 2.12
If , , and are matrices, and the matrix products are defined, then the product is associative and distributes over matrix addition and .
- Proof
Associativity holds because matrix multiplication represents function composition, which is associative: the maps and are equal as both send to .
Distributivity is similar. For instance, the first one goes (the third equality uses the linearity of ).
- Remark 2.13
We could alternatively prove that result by slogging through the indices. For example, associativity goes: the -th entry of is
(where , , and are , , and matrices), distribute
and regroup around the 's
to get the entry of .
Contrast these two ways of verifying associativity, the one in the proof and the one just above. The argument just above is hard to understand in the sense that, while the calculations are easy to check, the arithmetic seems unconnected to any idea (it also essentially repeats the proof of Theorem 2.6 and so is inefficient). The argument in the proof is shorter, clearer, and says why this property "really" holds. This illustrates the comments made in the preamble to the chapter on vector spaces— at least some of the time an argument from higher-level constructs is clearer.
We have now seen how the representation of the composition of two linear maps is derived from the representations of the two maps. We have called the combination the product of the two matrices. This operation is extremely important. Before we go on to study how to represent the inverse of a linear map, we will explore it some more in the next subsection.
Exercises
- This exercise is recommended for all readers.
- Problem 1
Compute, or state "not defined".
- This exercise is recommended for all readers.
- Problem 2
Where
compute or state "not defined".
- Problem 3
Which products are defined?
- times
- times
- times
- times
- This exercise is recommended for all readers.
- Problem 4
Give the size of the product or state "not defined".
- a matrix times a matrix
- a matrix times a matrix
- a matrix times a matrix
- a matrix times a matrix
- This exercise is recommended for all readers.
- Problem 5
Find the system of equations resulting from starting with
and making this change of variable (i.e., substitution).
- Problem 6
As Definition 2.3 points out, the matrix product operation generalizes the dot product. Is the dot product of a row vector and a column vector the same as their matrix-multiplicative product?
- This exercise is recommended for all readers.
- Problem 7
Represent the derivative map on with respect to where is the natural basis . Show that the product of this matrix with itself is defined; what map does it represent?
- Problem 8
Show that composition of linear transformations on is commutative. Is this true for any one-dimensional space?
- Problem 9
Why is matrix multiplication not defined as entry-wise multiplication? That would be easier, and commutative too.
- This exercise is recommended for all readers.
- Problem 10
- Prove that and for positive integers .
- Prove that for any positive integer and scalar .
- This exercise is recommended for all readers.
- Problem 11
- How does matrix multiplication interact with scalar multiplication: is ? Is ?
- How does matrix multiplication interact with linear combinations: is ? Is ?
- Problem 12
We can ask how the matrix product operation interacts with the transpose operation.
- Show that .
- A square matrix is symmetric if each entry equals the entry, that is, if the matrix equals its own transpose. Show that the matrices and are symmetric.
- This exercise is recommended for all readers.
- Problem 13
Rotation of vectors in about an axis is a linear map. Show that linear maps do not commute by showing geometrically that rotations do not commute.
- Problem 14
In the proof of Theorem 2.12 some maps are used. What are the domains and codomains?
- Problem 15
How does matrix rank interact with matrix multiplication?
- Can the product of rank matrices have rank less than ? Greater?
- Show that the rank of the product of two matrices is less than or equal to the minimum of the rank of each factor.
- Problem 16
Is "commutes with" an equivalence relation among matrices?
- This exercise is recommended for all readers.
- Problem 17
(This will be used in the Matrix Inverses exercises.) Here is another property of matrix multiplication that might be puzzling at first sight.
- Prove that the composition of the projections onto the and axes is the zero map despite that neither one is itself the zero map.
- Prove that the composition of the derivatives is the zero map despite that neither is the zero map.
- Give a matrix equation representing the first fact.
- Give a matrix equation representing the second.
When two things multiply to give zero despite that neither is zero, each is said to be a zero divisor.
- Problem 18
Show that, for square matrices, need not equal .
- This exercise is recommended for all readers.
- Problem 19
Represent the identity transformation with respect to for any basis . This is the identity matrix . Show that this matrix plays the role in matrix multiplication that the number plays in real number multiplication: (for all matrices for which the product is defined).
- Problem 20
In real number algebra, quadratic equations have at most two solutions. That is not so with matrix algebra. Show that the matrix equation has more than two solutions, where is the identity matrix (this matrix has ones in its and entries and zeroes elsewhere; see Problem 19).
- Problem 21
- Prove that for any matrix there are scalars that are not all such that the combination is the zero matrix (where is the identity matrix, with 's in its and entries and zeroes elsewhere; see Problem 19).
- Let be a polynomial . If is a square matrix we define to be the matrix (where is the appropriately-sized identity matrix). Prove that for any square matrix there is a polynomial such that is the zero matrix.
- The minimal polynomial
of a square matrix is the
polynomial of least degree, and with leading coefficient ,
such that is the zero matrix.
Find the minimal polynomial of this matrix.
- Problem 22
The infinite-dimensional space of all finite-degree polynomials gives a memorable example of the non-commutativity of linear maps. Let be the usual derivative and let be the shift map.
Show that the two maps don't commute ; in fact, not only is not the zero map, it is the identity map.
- Problem 23
Recall the notation for the sum of the sequence of numbers .
In this notation, the entry of the product of and is this.
Using this notation,
- reprove that matrix multiplication is associative;
- reprove Theorem 2.6.
3 - Mechanics of Matrix Multiplication
In this subsection we consider matrix multiplication as a mechanical process, putting aside for the moment any implications about the underlying maps. As described earlier, the striking thing about matrix multiplication is the way rows and columns combine. The entry of the matrix product is the dot product of row of the left matrix with column of the right one. For instance, here a second row and a third column combine to make a entry.
We can view this as the left matrix acting by multiplying its rows, one at a time, into the columns of the right matrix. Of course, another perspective is that the right matrix uses its columns to act on the left matrix's rows. Below, we will examine actions from the left and from the right for some simple matrices.
The first case, the action of a zero matrix, is very easy.
- Example 3.1
Multiplying by an appropriately-sized zero matrix from the left or from the right
results in a zero matrix.
After zero matrices, the matrices whose actions are easiest to understand are the ones with a single nonzero entry.
- Definition 3.2
A matrix with all zeroes except for a one in the entry is an unit matrix.
- Example 3.3
This is the unit matrix with three rows and two columns, multiplying from the left.
Acting from the left, an unit matrix copies row of the multiplicand into row of the result. From the right an unit matrix copies column of the multiplicand into column of the result.
- Example 3.4
Rescaling these matrices simply rescales the result. This is the action from the left of the matrix that is twice the one in the prior example.
And this is the action of the matrix that is minus three times the one from the prior example.
Next in complication are matrices with two nonzero entries. There are two cases. If a left-multiplier has entries in different rows then their actions don't interact.
- Example 3.5
But if the left-multiplier's nonzero entries are in the same row then that row of the result is a combination.
- Example 3.6
Right-multiplication acts in the same way, with columns.
These observations about matrices that are mostly zeroes extend to arbitrary matrices.
- Lemma 3.7
In a product of two matrices and , the columns of are formed by taking times the columns of
and the rows of are formed by taking the rows of times
(ignoring the extra parentheses).
- Proof
We will show the case and leave the general case as an exercise.
The right side of the first equation in the result
is indeed the same as the right side of GH, except for the extra parentheses (the ones marking the columns as column vectors). The other equation is similarly easy to recognize.
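Here is a numerical spot-check of the lemma with assumed matrices.

```python
import numpy as np

G = np.array([[1.0, 2.0],
              [3.0, 4.0]])
H = np.array([[0.0, 1.0, 2.0],
              [1.0, 1.0, 0.0]])
GH = G @ H

# The j-th column of GH is G times the j-th column of H,
# and the i-th row of GH is the i-th row of G times H.
print(np.allclose(GH[:, 1], G @ H[:, 1]))   # True
print(np.allclose(GH[0, :], G[0, :] @ H))   # True
```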
An application of those observations is that there is a matrix that just copies out the rows and columns.
- Definition 3.8
The main diagonal (or principal diagonal or diagonal) of a square matrix goes from the upper left to the lower right.
- Definition 3.9
An identity matrix is square with all entries zero except for ones on the main diagonal.
- Example 3.10
The identity leaves its multiplicand unchanged both from the left
and from the right.
- Example 3.11
So does the identity matrix.
In short, an identity matrix is the identity element of the set of matrices with respect to the operation of matrix multiplication.
We next see two ways to generalize the identity matrix.
The first is that if the ones are relaxed to arbitrary reals, the resulting matrix will rescale whole rows or columns.
- Definition 3.12
A diagonal matrix is square and has zeros off the main diagonal.
- Example 3.13
From the left, the action of multiplication by a diagonal matrix is to rescale the rows.
From the right such a matrix rescales the columns.
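A short sketch with an assumed diagonal matrix shows the two actions.

```python
import numpy as np

D = np.diag([2.0, -1.0])
M = np.array([[1.0, 2.0],
              [3.0, 4.0]])

print(D @ M)   # [[ 2.  4.] [-3. -4.]] -- rows of M rescaled by 2 and -1
print(M @ D)   # [[ 2. -2.] [ 6. -4.]] -- columns of M rescaled by 2 and -1
```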
The second generalization of identity matrices is that we can put a single one in each row and column in ways other than putting them down the diagonal.
- Definition 3.14
A permutation matrix is square and is all zeros except for a single one in each row and column.
- Example 3.15
From the left these matrices permute rows.
From the right they permute columns.
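A short sketch with an assumed permutation matrix shows both actions.

```python
import numpy as np

P = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # a single one in each row and column
M = np.array([[1.0, 2.0],
              [3.0, 4.0]])

print(P @ M)   # [[3. 4.] [1. 2.]] -- rows swapped
print(M @ P)   # [[2. 1.] [4. 3.]] -- columns swapped
```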
We finish this subsection by applying these observations to get matrices that perform Gauss' method and Gauss-Jordan reduction.
- Example 3.16
We have seen how to produce a matrix that will rescale rows. Multiplying by this diagonal matrix rescales the second row of the other by a factor of three.
We have seen how to produce a matrix that will swap rows. Multiplying by this permutation matrix swaps the first and third rows.
To see how to perform a pivot, we observe something about those two examples. The matrix that rescales the second row by a factor of three arises in this way from the identity.
Similarly, the matrix that swaps first and third rows arises in this way.
- Example 3.17
The matrix that arises as
will, when it acts from the left, perform the pivot operation .
- Definition 3.18
The elementary reduction matrices are obtained from identity matrices with one Gaussian operation. We denote them:
- for ;
- for ;
- for .
- Lemma 3.19
Gaussian reduction can be done through matrix multiplication.
- If then .
- If then .
- If then .
- Proof
Clear.
- Example 3.20
This is the first system, from the first chapter, on which we performed Gauss' method.
It can be reduced with matrix multiplication. Swap the first and third rows,
triple the first row,
and then add times the first row to the second.
Now back substitution will give the solution.
- Example 3.21
Gauss-Jordan reduction works the same way. For the matrix ending the prior example, first adjust the leading entries
and to finish, clear the third column and then the second column.
We have observed the following result, which we shall use in the next subsection.
- Corollary 3.22
For any matrix there are elementary reduction matrices , ..., such that is in reduced echelon form.
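Here is a hedged sketch of the corollary on one assumed matrix: the elementary matrices below perform a row swap, a rescaling, and a pivot, and their product carries the matrix to reduced echelon form.

```python
import numpy as np

A = np.array([[0.0, 2.0, 4.0],
              [1.0, 1.0, 1.0]])

P = np.array([[0.0, 1.0], [1.0, 0.0]])    # swap row 1 and row 2
M = np.array([[1.0, 0.0], [0.0, 0.5]])    # rescale the new row 2 by 1/2
C = np.array([[1.0, -1.0], [0.0, 1.0]])   # add -1 times row 2 to row 1

R = C @ M @ P @ A
print(R)   # [[ 1.  0. -1.] [ 0.  1.  2.]] -- reduced echelon form
```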
Until now we have taken the point of view that our primary objects of study are vector spaces and the maps between them, and have adopted matrices only for computational convenience. This subsection shows that this point of view isn't the whole story. Matrix theory is a fascinating and fruitful area.
In the rest of this book we shall continue to focus on maps as the primary objects, but we will be pragmatic— if the matrix point of view gives some clearer idea then we shall use it.
Exercises
- This exercise is recommended for all readers.
- Problem 1
Predict the result of each multiplication by an elementary reduction matrix, and then check by multiplying it out.
- This exercise is recommended for all readers.
- Problem 2
The need to take linear combinations of rows and columns in tables of numbers arises often in practice. For instance, this is a map of part of Vermont and New York.
- The incidence matrix of a map is the square matrix whose entry is the number of roads from city to city . Produce the incidence matrix of this map (take the cities in alphabetical order).
- A matrix is symmetric if it equals its transpose. Show that an incidence matrix is symmetric. (These are all two-way streets. Vermont doesn't have many one-way streets.)
- What is the significance of the square of the incidence matrix? The cube?
- This exercise is recommended for all readers.
- Problem 3
This table gives the number of hours of each type done by each worker, and the associated pay rates. Use matrices to compute the wages due.
(Remark. This illustrates, as did the prior problem, that in practice we often want to compute linear combinations of rows and columns in a context where we really aren't interested in any associated linear maps.)
- Problem 4
Find the product of this matrix with its transpose.
- This exercise is recommended for all readers.
- Problem 5
Prove that the diagonal matrices form a subspace of . What is its dimension?
- Problem 6
Does the identity matrix represent the identity map if the bases are unequal?
- Problem 7
Show that every multiple of the identity commutes with every square matrix. Are there other matrices that commute with all square matrices?
- Problem 8
Prove or disprove: nonsingular matrices commute.
- This exercise is recommended for all readers.
- Problem 9
Show that the product of a permutation matrix and its transpose is an identity matrix.
- Problem 10
Show that if the first and second rows of are equal then so are the first and second rows of . Generalize.
- Problem 11
Describe the product of two diagonal matrices.
- Problem 12
Write
as the product of two elementary reduction matrices.
- This exercise is recommended for all readers.
- Problem 13
Show that if has a row of zeros then (if defined) has a row of zeros. Does that work for columns?
- Problem 14
Show that the set of unit matrices forms a basis for .
- Problem 15
Find the formula for the -th power of this matrix.
- This exercise is recommended for all readers.
- Problem 16
The trace of a square matrix is the sum of the entries on its diagonal (its significance appears in Chapter Five). Show that .
- This exercise is recommended for all readers.
- Problem 17
A square matrix is upper triangular if its only nonzero entries lie above, or on, the diagonal. Show that the product of two upper triangular matrices is upper triangular. Does this hold for lower triangular also?
- Problem 18
A square matrix is a Markov matrix if each entry is between zero and one and the sum along each row is one. Prove that a product of Markov matrices is Markov.
- This exercise is recommended for all readers.
- Problem 19
Give an example of two matrices of the same rank with squares of differing rank.
- Problem 20
Combine the two generalizations of the identity matrix, the one allowing entries to be other than ones, and the one allowing the single one in each row and column to be off the diagonal. What is the action of this type of matrix?
- Problem 21
On a computer multiplications are more costly than additions, so people are interested in reducing the number of multiplications used to compute a matrix product.
- How many real number multiplications are needed in the formula we gave for the product of a matrix and a matrix?
- Matrix multiplication is associative, so all associations yield the same result. The cost in number of multiplications, however, varies. Find the association requiring the fewest real number multiplications to compute the matrix product of a matrix, a matrix, a matrix, and a matrix.
- (Very hard.) Find a way to multiply two matrices using only seven multiplications instead of the eight suggested by the naive approach.
- ? Problem 22
If and are square matrices of the same size such that , does it follow that ? (Putnam Exam 1990)
- Problem 23
Demonstrate these four assertions to get an alternate proof that column rank equals row rank. (Liebeck 1966)
- iff .
- iff .
- .
- .
- Problem 24
Prove (where is an matrix and so defines a transformation of any -dimensional space with respect to where is a basis) that . Conclude
- iff ;
- iff ;
- iff and ;
- iff ;
- (Requires the Direct Sum subsection, which is optional.) iff .
4 - Inverses
We now consider how to represent the inverse of a linear map.
We start by recalling some facts about function inverses.[1] Some functions have no inverse, or have an inverse on the left side or right side only.
- Example 4.1
Where is the projection map
and is the embedding
the composition is the identity map on .
We say is a left inverse map of or, what is the same thing, that is a right inverse map of . However, composition in the other order doesn't give the identity map— here is a vector that is not sent to itself under .
In fact, the projection has no left inverse at all. For, if were to be a left inverse of then we would have
for all of the infinitely many 's. But no function can send a single argument to more than one value.
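For readers who want to see this numerically, here is a small sketch. It assumes the standard choices for this kind of example, a projection from three-space that drops the third component and an embedding that appends a zero; the text's own maps may differ.

```python
import numpy as np

# Hypothetical choices: the projection R^3 -> R^2 that drops the last
# component, and the embedding R^2 -> R^3 that appends a zero.
pi = np.array([[1, 0, 0],
               [0, 1, 0]])        # projection matrix (2x3)
eta = np.array([[1, 0],
                [0, 1],
                [0, 0]])          # embedding matrix (3x2)

print(pi @ eta)   # identity on R^2: eta is a right inverse of pi
print(eta @ pi)   # not the identity on R^3: e.g. (0, 0, 1) is sent to (0, 0, 0)
```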
(An example of a function with no inverse on either side is the zero transformation on .) Some functions have a two-sided inverse map, another function that is the inverse of the first, both from the left and from the right. For instance, the map given by has the two-sided inverse . In this subsection we will focus on two-sided inverses. The appendix shows that a function has a two-sided inverse if and only if it is both one-to-one and onto. The appendix also shows that if a function has a two-sided inverse then it is unique, and so it is called "the" inverse, and is denoted . So our purpose in this subsection is, where a linear map has an inverse, to find the relationship between and (recall that we have shown, in Theorem II.2.21 of Section II of this chapter, that if a linear map has an inverse then the inverse is a linear map also).
- Definition 4.2
A matrix is a left inverse matrix of the matrix if is the identity matrix. It is a right inverse matrix if is the identity. A matrix with a two-sided inverse is an invertible matrix. That two-sided inverse is called the inverse matrix and is denoted .
Because of the correspondence between linear maps and matrices, statements about map inverses translate into statements about matrix inverses.
- Lemma 4.3
If a matrix has both a left inverse and a right inverse then the two are equal.
- Theorem 4.4
A matrix is invertible if and only if it is nonsingular.
- Proof
(For both results.) Given a matrix , fix spaces of appropriate dimension for the domain and codomain. Fix bases for these spaces. With respect to these bases, represents a map . The statements are true about the map and therefore they are true about the matrix.
- Lemma 4.5
A product of invertible matrices is invertible— if and are invertible and if is defined then is invertible and .
- Proof
(This is just like the prior proof except that it requires two maps.) Fix appropriate spaces and bases and consider the represented maps and . Note that is a two-sided map inverse of since and . This equality is reflected in the matrices representing the maps, as required.
Here is the arrow diagram giving the relationship between map inverses and matrix inverses. It is a special case of the diagram for function composition and matrix multiplication.
Beyond its place in our general program of seeing how to represent map operations, another reason for our interest in inverses comes from solving linear systems. A linear system is equivalent to a matrix equation, as here.
By fixing spaces and bases (e.g., and ), we take the matrix to represent some map . Then solving the system is the same as asking: what domain vector is mapped by to the result ? If we could invert then we could solve the system by multiplying to get .
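As a hedged numerical sketch of this idea (the coefficient matrix and constants below are invented for illustration and are not the system from the text):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])      # hypothetical coefficient matrix
d = np.array([3.0, 5.0])        # hypothetical constants

A_inv = np.linalg.inv(A)        # exists because A is nonsingular
x = A_inv @ d                   # solve A x = d by multiplying by the inverse
print(x, A @ x)                 # A @ x reproduces d

# Once A_inv is in hand, systems with the same coefficients but new
# constants cost only one matrix-vector product each.
for d_new in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    print(A_inv @ d_new)
```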
- Example 4.6
We can find a left inverse for the matrix just given
by using Gauss' method to solve the resulting linear system.
Answer: , , , and . This matrix is actually the two-sided inverse of , as can easily be checked. With it we can solve the system () above by applying the inverse.
- Remark 4.7
Why solve systems this way, when Gauss' method takes less arithmetic (this assertion can be made precise by counting the number of arithmetic operations, as computer algorithm designers do)? Beyond its conceptual appeal of fitting into our program of discovering how to represent the various map operations, solving linear systems by using the matrix inverse has at least two advantages.
First, once the work of finding an inverse has been done, solving a system with the same coefficients but different constants is easy and fast: if we change the entries on the right of the system () then we get a related problem
with a related solution method.
In applications, solving many systems having the same matrix of coefficients is common.
Another advantage of inverses is that we can explore a system's sensitivity to changes in the constants. For example, tweaking the on the right of the system () to
can be solved with the inverse.
to show that changes by of the tweak while moves by of that tweak. This sort of analysis is used, for example, to decide how accurately data must be specified in a linear model to ensure that the solution has a desired accuracy.
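A brief sketch of that sensitivity computation, continuing the invented system above: the change in the solution is the inverse applied to the change in the constants.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
A_inv = np.linalg.inv(A)

delta_d = np.array([0.001, 0.0])    # tweak only the first constant
delta_x = A_inv @ delta_d           # resulting change in the solution
print(delta_x)                      # each component moves by a fixed multiple of the tweak
```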
We finish by describing the computational procedure usually used to find the inverse matrix.
- Lemma 4.8
A matrix is invertible if and only if it can be written as the product of elementary reduction matrices. The inverse can be computed by applying to the identity matrix the same row steps, in the same order, as are used to Gauss-Jordan reduce the invertible matrix.
- Proof
A matrix is invertible if and only if it is nonsingular and thus Gauss-Jordan reduces to the identity. By Corollary 3.22 this reduction can be done with elementary matrices . This equation gives the two halves of the result.
First, elementary matrices are invertible and their inverses are also elementary. Applying to the left of both sides of that equation, then , etc., gives as the product of elementary matrices (the is here to cover the trivial case).
Second, matrix inverses are unique and so comparison of the above equation with shows that . Therefore, applying to the identity, followed by , etc., yields the inverse of .
- Example 4.9
To find the inverse of
we do Gauss-Jordan reduction, meanwhile performing the same operations on the identity. For clerical convenience we write the matrix and the identity side-by-side, and do the reduction steps together.
This calculation has found the inverse.
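A rough computational sketch of the procedure from Lemma 4.8, row-reducing the matrix and the identity side by side (the example matrix is made up; a production routine would simply call a library function such as numpy.linalg.inv):

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Row-reduce the augmented matrix [A | I]; if the left half becomes
    the identity then the right half is A's inverse (Lemma 4.8)."""
    A = A.astype(float)
    n = A.shape[0]
    aug = np.hstack([A, np.eye(n)])
    for col in range(n):
        # find a usable pivot row and swap it up (a row-swap reduction step)
        pivot = col + np.argmax(np.abs(aug[col:, col]))
        if np.isclose(aug[pivot, col], 0.0):
            raise ValueError("matrix is singular, so not invertible")
        aug[[col, pivot]] = aug[[pivot, col]]
        aug[col] /= aug[col, col]            # rescale the pivot row
        for row in range(n):                 # clear the rest of the column
            if row != col:
                aug[row] -= aug[row, col] * aug[col]
    return aug[:, n:]

A = np.array([[0.0, 3.0], [1.0, 2.0]])       # hypothetical example
print(gauss_jordan_inverse(A))
print(np.linalg.inv(A))                      # agrees
```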
- Example 4.10
This one happens to start with a row swap.
- Example 4.11
A non-invertible matrix is detected by the fact that the left half won't reduce to the identity.
This procedure will find the inverse of a general matrix. The case is handy.
- Corollary 4.12
The inverse for a matrix exists and equals
if and only if .
- Proof
This computation is Problem 10.
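For a numerical spot check, here is the familiar two-by-two formula with determinant ad minus bc, which I take to be the formula the corollary intends:

```python
import numpy as np

def inverse_2x2(a, b, c, d):
    det = a * d - b * c
    if det == 0:
        raise ValueError("no inverse: ad - bc is zero")
    return (1.0 / det) * np.array([[d, -b],
                                   [-c, a]])

A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(inverse_2x2(1, 2, 3, 4) @ A)   # the identity, up to rounding
```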
We have seen here, as in the Mechanics of Matrix Multiplication subsection, that we can exploit the correspondence between linear maps and matrices. So we can fruitfully study both maps and matrices, translating back and forth to whichever helps us the most.
Over the entire four subsections of this section we have developed an algebra system for matrices. We can compare it with the familiar algebra system for the real numbers. Here we are working not with numbers but with matrices. We have matrix addition and subtraction operations, and they work in much the same way as the real number operations, except that they only combine same-sized matrices. We also have a matrix multiplication operation and an operation inverse to multiplication. These are somewhat like the familiar real number operations (associativity, and distributivity over addition, for example), but there are differences (failure of commutativity, for example). And, we have scalar multiplication, which is in some ways another extension of real number multiplication. This matrix system provides an example showing that algebra systems other than the elementary one can be interesting and useful.
Exercises
- Problem 1
Supply the intermediate steps in Example 4.10.
- This exercise is recommended for all readers.
- This exercise is recommended for all readers.
- Problem 3
For each invertible matrix in the prior problem, use Corollary 4.12 to find its inverse.
- This exercise is recommended for all readers.
- Problem 4
Find the inverse, if it exists, by using the Gauss-Jordan method. Check the answers for the matrices with Corollary 4.12.
- This exercise is recommended for all readers.
- Problem 5
What matrix has this one for its inverse?
- Problem 6
How does the inverse operation interact with scalar multiplication and addition of matrices?
- What is the inverse of ?
- Is ?
- This exercise is recommended for all readers.
- Problem 7
Is ?
- Problem 8
Is invertible?
- Problem 9
For each real number let be represented with respect to the standard bases by this matrix.
Show that . Show also that .
- Problem 10
Do the calculations for the proof of Corollary 4.12.
- Problem 11
Show that this matrix
has infinitely many right inverses. Show also that it has no left inverse.
- Problem 12
In Example 4.1, how many left inverses has ?
- Problem 13
If a matrix has infinitely many right inverses, can it have infinitely many left inverses? Must it have?
- This exercise is recommended for all readers.
- Problem 14
Assume that is invertible and that is the zero matrix. Show that is a zero matrix.
- Problem 15
Prove that if is invertible then the inverse commutes with a matrix if and only if itself commutes with that matrix .
- This exercise is recommended for all readers.
- Problem 16
Show that if is square and if is the zero matrix then . Generalize.
- This exercise is recommended for all readers.
- Problem 17
Let be diagonal. Describe , , ... , etc. Describe , , ... , etc. Define appropriately.
- Problem 18
Prove that any matrix row-equivalent to an invertible matrix is also invertible.
- Problem 19
The first question below appeared as Problem 15 in the Matrix Multiplication subsection.
- Show that the rank of the product of two matrices is less than or equal to the minimum of the rank of each.
- Show that if and are square then if and only if .
- Problem 20
Show that the inverse of a permutation matrix is its transpose.
- Problem 21
The first two parts of this question appeared as Problem 12 of the Matrix Multiplication subsection.
- Show that .
- A square matrix is symmetric if each entry equals the entry (that is, if the matrix equals its transpose). Show that the matrices and are symmetric.
- Show that the inverse of the transpose is the transpose of the inverse.
- Show that the inverse of a symmetric matrix is symmetric.
- This exercise is recommended for all readers.
- Problem 22
The items starting this question appeared as Problem 17 of the Matrix Multiplication subsection.
- Prove that the composition of the projections is the zero map despite that neither is the zero map.
- Prove that the composition of the derivatives is the zero map despite that neither map is the zero map.
- Give matrix equations representing each of the prior two items.
When two things multiply to give zero despite that neither is zero, each is said to be a zero divisor. Prove that no zero divisor is invertible.
- Problem 23
In real number algebra, there are exactly two numbers, and , that are their own multiplicative inverse. Does have exactly two solutions for matrices?
- Problem 24
Is the relation "is a two-sided inverse of" transitive? Reflexive? Symmetric?
- Problem 25
Prove: if the sum of the elements in each row of a square matrix is , then the sum of the elements in each row of the inverse matrix is . (Wilansky 1951)
Footnotes
- ↑ More information on function inverses is in the appendix.
Section V - Change of Basis
Representations, whether of vectors or of maps, vary with the bases. For instance, with respect to the two bases and
for , the vector has two different representations.
Similarly, with respect to and , the identity map has two different representations.
With our point of view that the objects of our study are vectors and maps, in fixing bases we are adopting a scheme of tags or names for these objects that is convenient for computation. We will now see how to translate among these names— we will see exactly how representations vary as the bases vary.
1 - Changing Representations of Vectors
In converting to the underlying vector doesn't change. Thus, this translation is accomplished by the identity map on the space, described so that the domain space vectors are represented with respect to and the codomain space vectors are represented with respect to .
(The diagram is vertical to fit with the ones in the next subsection.)
- Definition 1.1
The change of basis matrix for bases is the representation of the identity map with respect to those bases.
- Lemma 1.2
Left-multiplication by the change of basis matrix for converts a representation with respect to to one with respect to . Conversely, if left-multiplication by a matrix changes bases then is a change of basis matrix.
- Proof
For the first sentence, for each , as matrix-vector multiplication represents a map application, . For the second sentence, with respect to the matrix represents some linear map, whose action is , and is therefore the identity map.
- Example 1.3
With these bases for ,
because
the change of basis matrix is this.
We can see this matrix at work by finding the two representations of
and checking that the conversion goes as expected.
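One way to carry out such a check numerically: the j-th column of the change of basis matrix is the ending-basis representation of the j-th starting basis vector, so each column comes from solving a small linear system. The bases below are invented, not the ones in the example.

```python
import numpy as np

# Hypothetical bases for R^2, written as columns.
B = np.array([[1.0, 1.0],
              [0.0, 2.0]])      # starting basis B
D = np.array([[1.0, 0.0],
              [1.0, 1.0]])      # ending basis D

# Column j of the change of basis matrix is Rep_D(beta_j),
# that is, the solution of D x = beta_j.
change_B_to_D = np.linalg.solve(D, B)

v = np.array([3.0, 4.0])
rep_B = np.linalg.solve(B, v)             # representation of v with respect to B
rep_D = np.linalg.solve(D, v)             # representation of v with respect to D
print(np.allclose(change_B_to_D @ rep_B, rep_D))   # True: the conversion checks out
```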
We finish this subsection by recognizing that the change of basis matrices are familiar.
- Lemma 1.4
A matrix changes bases if and only if it is nonsingular.
- Proof
For one direction, if left-multiplication by a matrix changes bases then the matrix represents an invertible function, simply because the function is inverted by changing the bases back. Such a matrix is itself invertible, and so nonsingular.
To finish, we will show that any nonsingular matrix performs a change of basis operation from any given starting basis to some ending basis. Because the matrix is nonsingular, it will Gauss-Jordan reduce to the identity, so there are elementary reduction matrices such that . Elementary matrices are invertible and their inverses are also elementary, so multiplying from the left first by , then by , etc., gives as a product of elementary matrices . Thus, we will be done if we show that elementary matrices change a given basis to another basis, for then changes to some other basis , and changes to some , ..., and the net effect is that changes to . We will prove this about elementary matrices by covering the three types as separate cases.
Applying a row-multiplication matrix
changes a representation with respect to to one with respect to in this way.
Similarly, left-multiplication by a row-swap matrix changes a representation with respect to the basis into one with respect to the basis in this way.
And, a representation with respect to changes via left-multiplication by a row-combination matrix into a representation with respect to
(the definition of reduction matrices specifies that and and so this last one is a basis).
- Corollary 1.5
A matrix is nonsingular if and only if it represents the identity map with respect to some pair of bases.
In the next subsection we will see how to translate among representations of maps, that is, how to change to . The above corollary is a special case of this, where the domain and range are the same space, and where the map is the identity map.
Exercises
- This exercise is recommended for all readers.
- Problem 1
In , where
find the change of basis matrices from to and from to . Multiply the two.
- This exercise is recommended for all readers.
- Problem 2
Find the change of basis matrix for .
- ,
- ,
- ,
- ,
- Problem 3
For the bases in Problem 2, find the change of basis matrix in the other direction, from to .
- This exercise is recommended for all readers.
- Problem 4
Find the change of basis matrix for each .
- This exercise is recommended for all readers.
- Problem 5
Decide if each changes bases on . To what basis is changed?
- Problem 6
Find bases such that this matrix represents the identity map with respect to those bases.
- Problem 7
Consider the vector space of real-valued functions with basis . Show that is also a basis for this space. Find the change of basis matrix in each direction.
- Problem 8
Where does this matrix
send the standard basis for ? Any other bases? Hint. Consider the inverse.
- This exercise is recommended for all readers.
- Problem 9
What is the change of basis matrix with respect to ?
- Problem 10
Prove that a matrix changes bases if and only if it is invertible.
- Problem 11
Finish the proof of Lemma 1.4.
- This exercise is recommended for all readers.
- Problem 12
Let be a nonsingular matrix. What basis of does change to the standard basis?
- This exercise is recommended for all readers.
- Problem 13
- In with basis
we have this
representation.
- State and prove that any nonzero vector representation can be changed to any other.
Hint. The proof of Lemma 1.4 is constructive— it not only says the bases change, it shows how they change.
- Problem 14
Let be vector spaces, and let be bases for and be bases for . Where is linear, find a formula relating to .
- This exercise is recommended for all readers.
- Problem 15
Show that the columns of an change of basis matrix form a basis for . Do all bases appear in that way: can the vectors from any basis make the columns of a change of basis matrix?
- This exercise is recommended for all readers.
- Problem 16
Find a matrix having this effect.
That is, find a that left-multiplies the starting vector to yield the ending vector. Is there a matrix having these two effects?
Give a necessary and sufficient condition for there to be a matrix such that and .
2 - Changing Map Representations
The first subsection shows how to convert the representation of a vector with respect to one basis to the representation of that same vector with respect to another basis. Here we will see how to convert the representation of a map with respect to one pair of bases to the representation of that map with respect to a different pair. That is, we want the relationship between the matrices in this arrow diagram.
To move from the lower-left of this diagram to the lower-right we can either go straight over, or else up to then over to and then down. Restated in terms of the matrices, we can calculate either by simply using and , or else by first changing bases with then multiplying by and then changing bases with . This equation summarizes.
(To compare this equation with the sentence before it, remember that the equation is read from right to left because function composition is read right to left and matrix multiplication represents the composition.)
- Example 2.1
The matrix
represents, with respect to , the transformation that rotates vectors radians counterclockwise.
We can translate that representation with respect to to one with respect to
by using the arrow diagram and formula () above.
From this, we can use the formula:
Note that can be calculated as
the matrix inverse of .
Although the new matrix is messier-appearing, the map that it represents is the same. For instance, to replicate the effect of in the picture, start with ,
apply ,
and check it against
to see that it is the same result as above.
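A hedged numerical sketch of this translation, taking the rotation angle to be pi over six for concreteness and inventing a basis B (neither choice is necessarily the text's): the new representation is the old one conjugated by the change of basis matrix.

```python
import numpy as np

theta = np.pi / 6
T = np.array([[np.cos(theta), -np.sin(theta)],   # rotation wrt the standard basis
              [np.sin(theta),  np.cos(theta)]])

B = np.array([[1.0, 1.0],                        # hypothetical new basis, as columns
              [0.0, 2.0]])

# Rep_{B,B}(t) = (change from E2 to B) * T * (change from B to E2)
T_B = np.linalg.inv(B) @ T @ B

v = np.array([2.0, -1.0])
rep_B_of_v = np.linalg.solve(B, v)               # v written with respect to B
# Applying T_B to the B-representation and translating back gives t(v).
print(B @ (T_B @ rep_B_of_v))
print(T @ v)                                     # the same vector
```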
- Example 2.2
On the map
that is represented with respect to the standard basis in this way
can also be represented with respect to another basis
if then
in a way that is simpler, in that the action of a diagonal matrix is easy to understand.
Naturally, we usually prefer basis changes that make the representation easier to understand. When the representation with respect to equal starting and ending bases is a diagonal matrix we say the map or matrix has been diagonalized. In Chapter Five we shall see which maps and matrices are diagonalizable, and where one is not, we shall see how to get a representation that is nearly diagonal.
We finish this subsection by considering the easier case where representations are with respect to possibly different starting and ending bases. Recall that the prior subsection shows that a matrix changes bases if and only if it is nonsingular. That gives us another version of the above arrow diagram and equation ().
- Definition 2.3
Same-sized matrices and are matrix equivalent if there are nonsingular matrices and such that .
- Corollary 2.4
Matrix equivalent matrices represent the same map, with respect to appropriate pairs of bases.
Problem 10 checks that matrix equivalence is an equivalence relation. Thus it partitions the set of matrices into matrix equivalence classes.
(Picture: the set of all matrices partitioned into matrix equivalence classes, each class collecting the matrices that are matrix equivalent to one another.)
We can get some insight into the classes by comparing matrix equivalence with row equivalence (recall that matrices are row equivalent when they can be reduced to each other by row operations). In , the matrices and are nonsingular and thus each can be written as a product of elementary reduction matrices (see Lemma 4.8 in the previous subsection). Left-multiplication by the reduction matrices making up has the effect of performing row operations. Right-multiplication by the reduction matrices making up performs column operations. Therefore, matrix equivalence is a generalization of row equivalence— two matrices are row equivalent if one can be converted to the other by a sequence of row reduction steps, while two matrices are matrix equivalent if one can be converted to the other by a sequence of row reduction steps followed by a sequence of column reduction steps.
Thus, if matrices are row equivalent then they are also matrix equivalent (since we can take to be the identity matrix and so perform no column operations). The converse, however, does not hold.
- Example 2.5
These two
are matrix equivalent because the second can be reduced to the first by the column operation of taking times the first column and adding to the second. They are not row equivalent because they have different reduced echelon forms (in fact, both are already in reduced form).
We will close this section by finding a set of representatives for the matrix equivalence classes.[1]
- Theorem 2.6
Any matrix of rank is matrix equivalent to the matrix that is all zeros except that the first diagonal entries are ones.
Sometimes this is described as a block partial-identity form.
- Proof
As discussed above, Gauss-Jordan reduce the given matrix and combine all the reduction matrices used there to make . Then use the leading entries to do column reduction and finish by swapping columns to put the leading ones on the diagonal. Combine the reduction matrices used for those column operations into .
- Example 2.7
We illustrate the proof by finding the and for this matrix.
First Gauss-Jordan row-reduce.
Then column-reduce, which involves right-multiplication.
Finish by swapping columns.
Finally, combine the left-multipliers together as and the right-multipliers together as to get the equation.
- Corollary 2.8
Two same-sized matrices are matrix equivalent if and only if they have the same rank. That is, the matrix equivalence classes are characterized by rank.
- Proof
Two same-sized matrices with the same rank are equivalent to the same block partial-identity matrix.
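Since the class is determined by rank, the canonical representative can be written down directly from the rank; a small sketch using numpy's rank routine follows (the example matrices are made up).

```python
import numpy as np

def canonical_representative(A):
    """Block partial-identity matrix with the same shape and rank as A,
    the representative of A's matrix equivalence class."""
    r = np.linalg.matrix_rank(A)
    rep = np.zeros_like(A, dtype=float)
    rep[:r, :r] = np.eye(r)
    return rep

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])          # rank 1, hypothetical example
B = np.array([[0.0, 1.0, 0.0],
              [5.0, 0.0, 0.0]])          # rank 2, hypothetical example
print(canonical_representative(A))
print(canonical_representative(B))
# Two same-sized matrices are matrix equivalent exactly when these agree.
```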
- Example 2.9
The matrices have only three possible ranks: zero, one, or two. Thus there are three matrix-equivalence classes.
Each class consists of all of the matrices with the same rank. There is only one rank zero matrix, so that class has only one member, but the other two classes each have infinitely many members.
In this subsection we have seen how to change the representation of a map with respect to a first pair of bases to one with respect to a second pair. That led to a definition describing when matrices are equivalent in this way. Finally we noted that, with the proper choice of (possibly different) starting and ending bases, any map can be represented in block partial-identity form.
One of the nice things about this representation is that, in some sense, we can completely understand the map when it is expressed in this way: if the bases are and then the map sends
where is the map's rank. Thus, we can understand any linear map as a kind of projection.
Of course, "understanding" a map expressed in this way requires that we understand the relationship between and . However, despite that difficulty, this is a good classification of linear maps.
Exercises
- This exercise is recommended for all readers.
- Problem 1
Decide if these matrices are matrix equivalent.
- ,
- ,
- ,
- This exercise is recommended for all readers.
- Problem 2
Find the canonical representative of the matrix-equivalence class of each matrix.
- Problem 3
Suppose that, with respect to
the transformation is represented by this matrix.
Use change of basis matrices to represent with respect to each pair.
- ,
- ,
- This exercise is recommended for all readers.
- Problem 4
What sizes are and in the equation ?
- This exercise is recommended for all readers.
- Problem 5
Use Theorem 2.6 to show that a square matrix is nonsingular if and only if it is equivalent to an identity matrix.
- This exercise is recommended for all readers.
- Problem 6
Show that, where is a nonsingular square matrix, if and are nonsingular square matrices such that then .
- This exercise is recommended for all readers.
- Problem 7
Why does Theorem 2.6 not show that every matrix is diagonalizable (see Example 2.2)?
- Problem 8
Must matrix equivalent matrices have matrix equivalent transposes?
- Problem 9
What happens in Theorem 2.6 if ?
- This exercise is recommended for all readers.
- Problem 10
Show that matrix-equivalence is an equivalence relation.
- This exercise is recommended for all readers.
- Problem 11
Show that a zero matrix is alone in its matrix equivalence class. Are there other matrices like that?
- Problem 12
What are the matrix equivalence classes of matrices of transformations on ? ?
- Problem 13
How many matrix equivalence classes are there?
- Problem 14
Are matrix equivalence classes closed under scalar multiplication? Addition?
- Problem 15
Let represented by with respect to .
- Find in this specific case.
- Describe in the general case where .
- Problem 16
- Let have bases and and suppose that has the basis . Where , find the formula that computes from .
- Repeat the prior question with one basis for and two bases for .
- Problem 17
- If two matrices are matrix-equivalent and invertible, must their inverses be matrix-equivalent?
- If two matrices have matrix-equivalent inverses, must the two be matrix-equivalent?
- If two matrices are square and matrix-equivalent, must their squares be matrix-equivalent?
- If two matrices are square and have matrix-equivalent squares, must they be matrix-equivalent?
- This exercise is recommended for all readers.
- Problem 18
Square matrices are similar if they represent the same transformation, but each with respect to the same ending as starting basis. That is, is similar to .
- Give a definition of matrix similarity like that of Definition 2.3.
- Prove that similar matrices are matrix equivalent.
- Show that similarity is an equivalence relation.
- Show that if is similar to then is similar to , the cubes are similar, etc. Contrast with the prior exercise.
- Prove that there are matrix equivalent matrices that are not similar.
Footnotes
- ↑ More information on class representatives is in the appendix.
Section VI - Projection
This section is optional; only the last two sections of Chapter Five require this material.
We have described the projection from into its plane subspace as a "shadow map". This shows why, but it also shows that some shadows fall upward.
So perhaps a better description is: the projection of is the in the plane with the property that someone standing on and looking straight up or down sees . In this section we will generalize this to other projections, both orthogonal (i.e., "straight up and down") and nonorthogonal.
1 - Orthogonal Projection Onto a Line
We first consider orthogonal projection onto a line. To orthogonally project a vector onto a line , mark the point on the line at which someone standing on that point could see by looking straight up or down (from that person's point of view).
The picture shows someone who has walked out on the line until the tip of is straight overhead. That is, where the line is described as the span of some nonzero vector , the person has walked out to find the coefficient with the property that is orthogonal to .
We can solve for this coefficient by noting that because is orthogonal to a scalar multiple of it must be orthogonal to itself, and then the consequent fact that the dot product is zero gives that .
- Definition 1.1
The orthogonal projection of onto the line spanned by a nonzero is this vector.
Problem 13 checks that the outcome of the calculation depends only on the line and not on which vector happens to be used to describe that line.
- Remark 1.2
The wording of that definition says "spanned by " instead of the more formal "the span of the set ". This casual first phrase is common.
- Example 1.3
To orthogonally project the vector onto the line , we first pick a direction vector for the line. For instance,
will do. Then the calculation is routine.
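A minimal sketch of the projection formula, namely the dot product of the vector with the direction vector, divided by the direction vector's squared length, times the direction vector; the vectors here are invented for illustration.

```python
import numpy as np

def project_onto_line(v, s):
    """Orthogonal projection of v onto the line spanned by the nonzero vector s."""
    return (np.dot(v, s) / np.dot(s, s)) * s

v = np.array([2.0, 3.0])
s = np.array([1.0, 1.0])          # direction vector for a line such as y = x
p = project_onto_line(v, s)
print(p)                           # [2.5, 2.5]
print(np.dot(v - p, s))            # 0: the leftover part is orthogonal to the line
```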
- Example 1.4
In , the orthogonal projection of a general vector
onto the -axis is
which matches our intuitive expectation.
The picture above with the stick figure walking out on the line until 's tip is overhead is one way to think of the orthogonal projection of a vector onto a line. We finish this subsection with two other ways.
- Example 1.5
A railroad car left on an east-west track without its brake is pushed by a wind blowing toward the northeast at fifteen miles per hour; what speed will the car reach?
For the wind we use a vector of length that points toward the northeast.
The car can only be affected by the part of the wind blowing in the east-west direction— the part of in the direction of the -axis is this (the picture has the same perspective as the railroad car picture above).
So the car will reach a velocity of miles per hour toward the east.
Thus, another way to think of the picture that precedes the definition is that it shows as decomposed into two parts, the part with the line (here, the part with the tracks, ), and the part that is orthogonal to the line (shown here lying on the north-south axis). These two are "not interacting" or "independent", in the sense that the east-west car is not at all affected by the north-south part of the wind (see Problem 5). So the orthogonal projection of onto the line spanned by can be thought of as the part of that lies in the direction of .
Finally, another useful way to think of the orthogonal projection is to have the person stand not on the line, but on the vector that is to be projected to the line. This person has a rope over the line and pulls it tight, naturally making the rope orthogonal to the line.
That is, we can think of the projection as being the vector in the line that is closest to (see Problem 11).
- Example 1.6
A submarine is tracking a ship moving along the line . Torpedo range is one-half mile. Can the sub stay where it is, at the origin on the chart below, or must it move to reach a place where the ship will pass within range?
The formula for projection onto a line does not immediately apply because the line doesn't pass through the origin, and so isn't the span of any . To adjust for this, we start by shifting the entire map down two units. Now the line is , which is a subspace, and we can project to get the point of closest approach, the point on the line through the origin closest to
the sub's shifted position.
The distance between and is approximately miles and so the sub must move to get in range.
This subsection has developed a natural projection map: orthogonal projection onto a line. As suggested by the examples, it is often called for in applications. The next subsection shows how the definition of orthogonal projection onto a line gives us a way to calculate especially convenient bases for vector spaces, again something that is common in applications. The final subsection completely generalizes projection, orthogonal or not, onto any subspace at all.
Exercises
- This exercise is recommended for all readers.
- Problem 1
Project the first vector orthogonally onto the line spanned by the second vector.
- ,
- ,
- ,
- ,
- This exercise is recommended for all readers.
- Problem 2
Project the vector orthogonally onto the line.
- , the line
- Problem 3
Although the development of Definition 1.1 is guided by the pictures, we are not restricted to spaces that we can draw. In project this vector onto this line.
- This exercise is recommended for all readers.
- Problem 4
Definition 1.1 uses two vectors and . Consider the transformation of resulting from fixing
and projecting onto the line that is the span of . Apply it to these vectors.
Show that in general the projection transformation is this.
Express the action of this transformation with a matrix.
- Problem 5
Example 1.5 suggests that projection breaks into two parts, and , that are "not interacting". Recall that the two are orthogonal. Show that any two nonzero orthogonal vectors make up a linearly independent set.
- Problem 6
- What is the orthogonal projection of onto a line if is a member of that line?
- Show that if is not a member of the line then the set is linearly independent.
- Problem 7
Definition 1.1 requires that be nonzero. Why? What is the right definition of the orthogonal projection of a vector onto the (degenerate) line spanned by the zero vector?
- Problem 8
Are all vectors the projection of some other vector onto some line?
- This exercise is recommended for all readers.
- Problem 9
Show that the projection of onto the line spanned by has length equal to the absolute value of the number divided by the length of the vector .
- Problem 10
Find the formula for the distance from a point to a line.
- Problem 11
Find the scalar such that is a minimum distance from the point by using calculus (i.e., consider the distance function, set the first derivative equal to zero, and solve). Generalize to .
- This exercise is recommended for all readers.
- Problem 12
Prove that the orthogonal projection of a vector onto a line is shorter than the vector.
- This exercise is recommended for all readers.
- Problem 13
Show that the definition of orthogonal projection onto a line does not depend on the spanning vector: if is a nonzero multiple of then equals .
- This exercise is recommended for all readers.
- Problem 14
Consider the function mapping the plane to itself that takes a vector to its projection onto the line . These two each show that the map is linear, the first one in a way that is bound to the coordinates (that is, it fixes a basis and then computes) and the second in a way that is more conceptual.
- Produce a matrix that describes the function's action.
- Show also that this map can be obtained by first rotating everything in the plane radians clockwise, then projecting onto the -axis, and then rotating radians counterclockwise.
- Problem 15
For let be the projection of onto the line spanned by , let be the projection of onto the line spanned by , let be the projection of onto the line spanned by , etc., back and forth between the spans of and . That is, is the projection of onto the span of if is even, and onto the span of if is odd. Must that sequence of vectors eventually settle down— must there be a sufficiently large such that equals and equals ? If so, what is the earliest such ?
2 - Gram-Schmidt Orthogonalization
This subsection is optional. It requires material from the prior, also optional, subsection. The work done here will only be needed in the final two sections of Chapter Five.
The prior subsection suggests that projecting onto the line spanned by decomposes a vector into two parts
that are orthogonal and so are "not interacting". We will now develop that suggestion.
- Definition 2.1
Vectors are mutually orthogonal when any two are orthogonal: if then the dot product is zero.
- Theorem 2.2
If the vectors in a set are mutually orthogonal and nonzero then that set is linearly independent.
- Proof
Consider a linear relationship . If then taking the dot product of with both sides of the equation
shows, since is nonzero, that is zero.
- Corollary 2.3
If the vectors in a size subset of a dimensional space are mutually orthogonal and nonzero then that set is a basis for the space.
- Proof
Any linearly independent size subset of a dimensional space is a basis.
Of course, the converse of Corollary 2.3 does not hold— not every basis of every subspace of is made of mutually orthogonal vectors. However, we can get the partial converse that for every subspace of there is at least one basis consisting of mutually orthogonal vectors.
- Example 2.4
The members and of this basis for are not orthogonal.
However, we can derive from a new basis for the same space that does have mutually orthogonal members. For the first member of the new basis we simply use .
For the second member of the new basis, we take away from its part in the direction of ,
which leaves the part, pictured above, of that is orthogonal to (it is orthogonal by the definition of the projection onto the span of ). Note that, by the corollary, is a basis for .
- Definition 2.5
An orthogonal basis for a vector space is a basis of mutually orthogonal vectors.
- Example 2.6
To turn this basis for
into an orthogonal basis, we take the first vector as it is given.
We get by starting with the given second vector and subtracting away the part of it in the direction of .
Finally, we get by taking the third given vector and subtracting the part of it in the direction of , and also the part of it in the direction of .
Again the corollary gives that
is a basis for the space.
The next result verifies that the process used in those examples works with any basis for any subspace of an (we are restricted to only because we have not given a definition of orthogonality for other vector spaces).
- Theorem 2.7 (Gram-Schmidt orthogonalization)
If is a basis for a subspace of then, where
the 's form an orthogonal basis for the same subspace.
- Proof
We will use induction to check that each is nonzero, is in the span of and is orthogonal to all preceding vectors: . With those, and with Corollary 2.3, we will have that is a basis for the same space as .
We shall cover the cases up to , which give the sense of the argument. Completing the details is Problem 15.
The case is trivial— setting equal to makes it a nonzero vector since is a member of a basis, it is obviously in the desired span, and the "orthogonal to all preceding vectors" condition is vacuously met.
For the case, expand the definition of .
This expansion shows that is nonzero or else this would be a non-trivial linear dependence among the 's (it is nontrivial because the coefficient of is ) and also shows that is in the desired span. Finally, is orthogonal to the only preceding vector
because this projection is orthogonal.
The case is the same as the case except for one detail. As in the case, expanding the definition
shows that is nonzero and is in the span. A calculation shows that is orthogonal to the preceding vector .
(Here's the difference from the case— the second line has two kinds of terms. The first term is zero because this projection is orthogonal, as in the case. The second term is zero because is orthogonal to and so is orthogonal to any vector in the line spanned by .) The check that is also orthogonal to the other preceding vector is similar.
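Here is a hedged computational sketch of the orthogonalization procedure, followed by the length normalization discussed just below; the starting basis is invented for illustration.

```python
import numpy as np

def gram_schmidt(basis):
    """Orthogonalize a list of linearly independent vectors (Theorem 2.7):
    each kappa_i is beta_i minus its projections onto the earlier kappas."""
    kappas = []
    for beta in basis:
        kappa = beta.astype(float)
        for k in kappas:
            kappa -= (np.dot(beta, k) / np.dot(k, k)) * k
        kappas.append(kappa)
    return kappas

betas = [np.array([1.0, 1.0, 0.0]),
         np.array([1.0, 0.0, 1.0]),
         np.array([0.0, 1.0, 1.0])]      # hypothetical basis for R^3
kappas = gram_schmidt(betas)
print([np.round(np.dot(kappas[i], kappas[j]), 10)
       for i in range(3) for j in range(3) if i < j])   # all zeros: mutually orthogonal

# Normalizing each vector gives an orthonormal basis.
orthonormal = [k / np.linalg.norm(k) for k in kappas]
```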
Beyond having the vectors in the basis be orthogonal, we can do more; we can arrange for each vector to have length one by dividing each by its own length (we can normalize the lengths).
- Example 2.8
Normalizing the length of each vector in the orthogonal basis of Example 2.6 produces this orthonormal basis.
Besides its intuitive appeal, and its analogy with the standard basis for , an orthonormal basis also simplifies some computations. See Exercise 9, for example.
Exercises
- Problem 1
Perform the Gram-Schmidt process on each of these bases for .
Then turn those orthogonal bases into orthonormal bases.
- This exercise is recommended for all readers.
- Problem 2
Perform the Gram-Schmidt process on each of these bases for .
Then turn those orthogonal bases into orthonormal bases.
- This exercise is recommended for all readers.
- Problem 3
Find an orthonormal basis for this subspace of : the plane .
- Problem 4
Find an orthonormal basis for this subspace of .
- Problem 5
Show that any linearly independent subset of can be orthogonalized without changing its span.
- This exercise is recommended for all readers.
- Problem 6
What happens if we apply the Gram-Schmidt process to a basis that is already orthogonal?
- Problem 7
Let be a set of mutually orthogonal vectors in .
- Prove that for any in the space, the vector is orthogonal to each of , ..., .
- Illustrate the prior item in by using as , using as , and taking to have components , , and .
- Show that is the vector in the span of the set of 's that is closest to . Hint. To the illustration done for the prior part, add a vector and apply the Pythagorean Theorem to the resulting triangle.
- Problem 8
Find a vector in that is orthogonal to both of these.
- This exercise is recommended for all readers.
- Problem 9
One advantage of orthogonal bases is that they simplify finding the representation of a vector with respect to that basis.
- For this vector and this non-orthogonal basis for
- With this orthogonal basis for
- Let be an orthogonal basis for some subspace of . Prove that for any in the subspace, the -th component of the representation is the scalar coefficient from .
- Prove that .
- Problem 10
Bessel's Inequality. Consider these orthonormal sets
along with the vector whose components are , , , and .
- Find the coefficient for the projection of onto the span of the vector in . Check that .
- Find the coefficients and for the projection of onto the spans of the two vectors in . Check that .
- Find , , and associated with the vectors in , and , , , and for the vectors in . Check that and that .
Show that this holds in general: where is an orthonormal set and is the coefficient of the projection of a vector from the space then . Hint. One way is to look at the inequality and expand the 's.
- Problem 11
Prove or disprove: every vector in is in some orthogonal basis.
- Problem 12
Show that the columns of an matrix form an orthonormal set if and only if the inverse of the matrix is its transpose. Produce such a matrix.
- Problem 13
Does the proof of Theorem 2.2 fail to consider the possibility that the set of vectors is empty (i.e., that )?
- Problem 14
Theorem 2.7 describes a change of basis from any basis to one that is orthogonal . Consider the change of basis matrix .
- Prove that the matrix changing bases in the direction opposite to that of the theorem has an upper triangular shape— all of its entries below the main diagonal are zeros.
- Prove that the inverse of an upper triangular matrix is also upper triangular (if the matrix is invertible, that is). This shows that the matrix changing bases in the direction described in the theorem is upper triangular.
- Problem 15
Complete the induction argument in the proof of Theorem 2.7.
3 - Projection Onto a Subspace
This subsection, like the others in this section, is optional. It also requires material from the optional earlier subsection on Combining Subspaces.
The prior subsections project a vector onto a line by decomposing it into two parts: the part in the line and the rest . To generalize projection to arbitrary subspaces, we follow this idea.
- Definition 3.1
For any direct sum and any , the projection of onto along is
where with .
This definition doesn't involve a sense of "orthogonal" so we can apply it to spaces other than subspaces of an . (Definitions of orthogonality for other spaces are perfectly possible, but we haven't seen any in this book.)
- Example 3.2
The space of matrices is the direct sum of these two.
To project
onto along , we first fix bases for the two subspaces.
The concatenation of these
is a basis for the entire space, because the space is the direct sum, so we can use it to represent .
Now the projection of onto along is found by keeping the part of this sum and dropping the part.
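As a hedged computational sketch of this recipe (worked in three-space with invented subspaces rather than in the matrix space of this example): concatenate a basis for M with a basis for N, represent the vector with respect to that concatenation, and keep only the M part.

```python
import numpy as np

# Hypothetical subspaces of R^3: M is a plane (two basis vectors),
# N is a line (one basis vector), and R^3 is the direct sum of M and N.
B_M = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0]])        # columns span M
B_N = np.array([[0.0],
                [0.0],
                [1.0]])             # column spans N

def project_onto_along(v, B_M, B_N):
    """Projection of v onto M along N: write v in the concatenated basis
    and keep only the M coefficients."""
    concat = np.hstack([B_M, B_N])           # basis for the whole space
    coeffs = np.linalg.solve(concat, v)      # representation of v
    m_part = coeffs[:B_M.shape[1]]           # drop the N coefficients
    return B_M @ m_part

v = np.array([1.0, 2.0, 5.0])
print(project_onto_along(v, B_M, B_N))       # [1. 2. 3.]
```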
- Example 3.3
Both subscripts on are significant. The first subscript matters because the result of the projection is an , and changing this subspace would change the possible results. For an example showing that the second subscript matters, fix this plane subspace of and its basis
and compare the projections along two different subspaces.
(Verification that and is routine.) We will check that these projections are different by checking that they have different effects on this vector.
For the first one we find a basis for
and represent with respect to the concatenation .
The projection of onto along is found by keeping the part and dropping the part.
For the other subspace , this basis is natural.
Representing with respect to the concatenation
and then keeping only the part gives this.
Therefore projection along different subspaces may yield different results.
These pictures compare the two maps. Both show that the projection is indeed "onto" the plane and "along" the line.
Notice that the projection along is not orthogonal— there are members of the plane that are not orthogonal to the dotted line. But the projection along is orthogonal.
A natural question is: what is the relationship between the projection operation defined above, and the operation of orthogonal projection onto a line? The second picture above suggests the answer— orthogonal projection onto a line is a special case of the projection defined above; it is just projection along a subspace perpendicular to the line.
In addition to pointing out that projection along a subspace is a generalization, this scheme shows how to define orthogonal projection onto any subspace of