Linear Algebra/Gauss-Jordan Reduction
Gaussian elimination coupled with back-substitution solves linear systems, but it's not the only method possible. Here is an extension of Gauss' method that has some advantages.
- Example 1.1
To solve
we can start by going to echelon form as usual.
We can keep going to a second stage by making the leading entries into ones
and then to a third stage that uses the leading entries to eliminate all of the other entries in each column by pivoting upwards.
The answer is , , and .
Note that the pivot operations in the first stage proceed from column one to column three while the pivot operations in the third stage proceed from column three to column one.
- Example 1.2
We often combine the operations of the middle stage into a single step, even though they are operations on different rows.
The answer is and .
This extension of Gauss' method is Gauss-Jordan reduction. It goes past echelon form to a more refined, more specialized, matrix form.
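The three stages described above can be sketched in code. The following is a minimal sketch, not a definitive implementation: the function name `gauss_jordan`, the list-of-rows representation, and the use of exact `Fraction` arithmetic are all assumptions made for this illustration.

```python
from fractions import Fraction


def gauss_jordan(rows):
    """Return the reduced echelon form of a matrix given as a list of rows.

    Hypothetical helper for illustration; entries are converted to Fractions
    so that the arithmetic is exact.
    """
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivots = []  # (row, column) position of each leading entry

    # Stage 1: Gauss' method, working from the leftmost column to the right.
    r = 0
    for c in range(n_cols):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, n_rows) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, n_rows):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append((r, c))
        r += 1
        if r == n_rows:
            break

    # Stage 2: rescale each nonzero row so that its leading entry is one.
    for pr, pc in pivots:
        lead = m[pr][pc]
        m[pr] = [a / lead for a in m[pr]]

    # Stage 3: clear the entries above each leading one, rightmost column first.
    for pr, pc in reversed(pivots):
        for i in range(pr):
            factor = m[i][pc]
            m[i] = [a - factor * b for a, b in zip(m[i], m[pr])]
    return m
```

As a usage example on an illustrative system chosen for this sketch, the augmented matrix `[[1, 1, -2, -2], [0, 1, 3, 7], [1, 0, -1, -1]]` reduces to `[[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 2]]`, so the solution can be read from the right-hand column. Stage 1 runs through the columns left to right and stage 3 runs right to left, matching the note after Example 1.1.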
- Definition 1.3
A matrix is in reduced echelon form if, in addition to being in echelon form, each leading entry is a one and is the only nonzero entry in its column.
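The definition can be restated as a check on a matrix. This is a sketch only; the names `leading_index` and `is_reduced_echelon_form` are hypothetical, and the matrix is assumed to be a nonempty list of equal-length rows.

```python
def leading_index(row):
    """Column index of the first nonzero entry, or None for an all-zero row."""
    return next((j for j, x in enumerate(row) if x != 0), None)


def is_reduced_echelon_form(m):
    """Check the conditions of the definition on a list-of-rows matrix."""
    leads = [leading_index(row) for row in m]

    # Echelon form: all-zero rows sit at the bottom, and each leading entry
    # is strictly to the right of the leading entry in the row above.
    seen_zero_row = False
    previous = -1
    for lead in leads:
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row or lead <= previous:
            return False
        previous = lead

    # Reduced: each leading entry is a one and is alone in its column.
    for r, lead in enumerate(leads):
        if lead is None:
            continue
        if m[r][lead] != 1:
            return False
        if any(m[i][lead] != 0 for i in range(len(m)) if i != r):
            return False
    return True
```

For instance, `[[1, 2, 0], [0, 0, 1]]` passes the check, while `[[1, 0], [0, 2]]` fails because a leading entry is not one, and `[[1, 1, 3], [0, 1, 1]]` fails because a leading entry shares its column with another nonzero entry.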
The disadvantage of using Gauss-Jordan reduction to solve a system is that the additional row operations mean additional arithmetic. The advantage is that the solution set can just be read off.
In any echelon form, plain or reduced, we can read off what kind of solution set a system has. The solution set is empty when there is a contradictory equation. It has one element when there is no contradiction and every variable is the leading variable in some row. It is infinite when there is no contradiction and at least one variable is free.
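The three cases can be sketched as a small classifier. It assumes its input is an augmented matrix already in echelon form, with the constants in the last column; the function name `classify_solution_set` and the returned labels are hypothetical.

```python
def classify_solution_set(echelon):
    """Classify a linear system from an echelon form augmented matrix.

    Assumes the input is already in echelon form and that the last column
    holds the constants from the right-hand sides.
    """
    n_vars = len(echelon[0]) - 1
    leading_columns = []
    for row in echelon:
        lead = next((j for j, x in enumerate(row) if x != 0), None)
        if lead is None:
            continue  # the row reads 0 = 0 and carries no information
        if lead == n_vars:
            return "empty"  # the row reads 0 = nonzero, a contradiction
        leading_columns.append(lead)
    # With no contradiction, the set has one element exactly when every
    # variable leads some row; otherwise at least one variable is free.
    return "one element" if len(leading_columns) == n_vars else "infinite"
```

For example, `[[1, 1, 3], [0, 0, 1]]` is classified as empty, `[[1, 1, 3], [0, -2, -2]]` as having one element, and `[[1, 1, 3], [0, 0, 0]]` as infinite.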
In reduced echelon form we can read off not just what kind of solution set the system has, but also its description. Whether or not the echelon form is reduced, we have no trouble describing the solution set when it is empty, of course. The two examples above show that when the system has a single solution then the solution can be read off from the right-hand column. In the case when the solution set is infinite, its parametrization can also be read off of the reduced echelon form. Consider, for example, this system that is shown brought to echelon form and then to reduced echelon form.
Starting with the middle matrix, the echelon form version, back substitution produces so that , then another back substitution gives implying that , and then the final back substitution gives implying that . Thus the solution set is this.
Now, considering the final matrix, the reduced echelon form version, note that adjusting the parametrization by moving the terms to the other side does indeed give the description of this infinite solution set.
Part of the reason that this works is straightforward. While a set can have many parametrizations that describe it, e.g., both of these also describe the above set (take to be and to be )
nonetheless we have in this book stuck to a convention of parametrizing using the unmodified free variables (that is, instead of ). We can easily see that a reduced echelon form version of a system is equivalent to a parametrization in terms of unmodified free variables. For instance,
(to move from left to right we also need to know how many equations are in the system). So, the convention of parametrizing with the free variables by solving each equation for its leading variable and then eliminating that leading variable from every other equation is exactly equivalent to the reduced echelon form conditions that each leading entry must be a one and must be the only nonzero entry in its column.
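Reading a parametrization off a reduced echelon form can be sketched as follows. The sketch assumes a consistent system given as an augmented matrix in reduced echelon form; the name `read_off_parametrization` and the return format (a particular solution plus one direction vector per free variable) are choices made for this illustration.

```python
from fractions import Fraction


def read_off_parametrization(rref):
    """Read a parametrization from a reduced echelon form augmented matrix.

    Returns (particular, directions), where directions maps the index of
    each free variable to the vector that multiplies it. Hypothetical helper.
    """
    n_vars = len(rref[0]) - 1
    lead_of_row = {}
    for r, row in enumerate(rref):
        lead = next((j for j, x in enumerate(row) if x != 0), None)
        if lead is None:
            continue
        if lead == n_vars:
            raise ValueError("contradictory equation: the solution set is empty")
        lead_of_row[r] = lead

    leading = set(lead_of_row.values())
    free = [j for j in range(n_vars) if j not in leading]

    particular = [Fraction(0)] * n_vars
    directions = {f: [Fraction(0)] * n_vars for f in free}
    for f in free:
        directions[f][f] = Fraction(1)  # each free variable stands for itself

    # Solve each equation for its leading variable: the constant stays put
    # and the free-variable terms move to the other side, changing sign.
    for r, lead in lead_of_row.items():
        particular[lead] = Fraction(rref[r][n_vars])
        for f in free:
            directions[f][lead] = -Fraction(rref[r][f])
    return particular, directions
```

On the illustrative matrix `[[1, 0, 2, 4], [0, 1, -1, 1]]`, which encodes $x + 2z = 4$ and $y - z = 1$, the function returns the particular solution `[4, 1, 0]` and the direction `[-2, 1, 1]` for the free variable $z$, that is, $x = 4 - 2z$ and $y = 1 + z$.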
Not as straightforward is the other part of the reason that the reduced echelon form version allows us to read off the parametrization that we would have gotten had we stopped at echelon form and then done back substitution. The prior paragraph shows that reduced echelon form corresponds to some parametrization, but why the same parametrization? A solution set can be parametrized in many ways, and Gauss' method or the Gauss-Jordan method can be done in many ways, so a first guess might be that we could derive many different reduced echelon form versions of the same starting system and many different parametrizations. But we never do. Experience shows that starting with the same system and proceeding with row operations in many different ways always yields the same reduced echelon form and the same parametrization (using the unmodified free variables).
In the rest of this section we will show that the reduced echelon form version of a matrix is unique. It follows that the parametrization of a linear system in terms of its unmodified free variables is unique because two different ones would give two different reduced echelon forms.
We shall use this result, and the ones that lead up to it, in the rest of the book but perhaps a restatement in a way that makes it seem more immediately useful may be encouraging. Imagine that we solve a linear system, parametrize, and check in the back of the book for the answer. But the parametrization there appears different. Have we made a mistake, or could these be different-looking descriptions of the same set, as with the three descriptions above of ? The prior paragraph notes that we will show here that different-looking parametrizations (using the unmodified free variables) describe genuinely different sets.
Here is an informal argument that the reduced echelon form version of a matrix is unique. Consider again the example that started this section of a matrix that reduces to three different echelon form matrices. The first matrix of the three is the natural echelon form version. The second matrix is the same as the first except that a row has been halved. The third matrix, too, is just a cosmetic variant of the first. The definition of reduced echelon form outlaws this kind of fooling around. In reduced echelon form, halving a row is not possible because that would change the row's leading entry away from one, and neither is combining rows possible, because then a leading entry would no longer be alone in its column.
This informal justification is not a proof; we have argued that no two different reduced echelon form matrices are related by a single row operation step, but we have not ruled out the possibility that multiple steps might do. Before we go to that proof, we finish this subsection by rephrasing our work in a terminology that will be enlightening.
Many different matrices yield the same reduced echelon form matrix. The three echelon form matrices from the start of this section, and the matrix they were derived from, all give this reduced echelon form matrix.
We think of these matrices as related to each other. The next result speaks to this relationship.
- Lemma 1.4
Elementary row operations are reversible.
- Proof
For any matrix $A$, the effect of swapping rows is reversed by swapping them back, multiplying a row by a nonzero $k$ is undone by multiplying by $1/k$, and adding a multiple of row $i$ to row $j$ (with $i \neq j$) is undone by subtracting the same multiple of row $i$ from row $j$.
(The $i \neq j$ condition is needed. See Problem 7.)
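The reversals named in the proof can be tried out directly. In this sketch the three operations are written as hypothetical functions `swap`, `scale`, and `add_multiple` on a list-of-rows matrix, with rows indexed from zero; each returns a new matrix and enforces the restriction that the lemma relies on.

```python
from fractions import Fraction


def swap(m, i, j):
    """Swap rows i and j."""
    out = [row[:] for row in m]
    out[i], out[j] = out[j], out[i]
    return out


def scale(m, i, k):
    """Multiply row i by a nonzero scalar k."""
    if k == 0:
        raise ValueError("the scalar must be nonzero")
    out = [row[:] for row in m]
    out[i] = [k * x for x in out[i]]
    return out


def add_multiple(m, i, j, k):
    """Add k times row i to row j, where i and j are different rows."""
    if i == j:
        raise ValueError("the two rows must be different")
    out = [row[:] for row in m]
    out[j] = [a + k * b for a, b in zip(out[j], out[i])]
    return out


# An arbitrary matrix chosen for illustration.
A = [[1, 2], [3, 4]]

# Each operation followed by its reverse gives back the starting matrix.
assert swap(swap(A, 0, 1), 0, 1) == A
assert scale(scale(A, 0, Fraction(3)), 0, Fraction(1, 3)) == A
assert add_multiple(add_multiple(A, 0, 1, 5), 0, 1, -5) == A
```

Using `Fraction` for the scalars keeps the arithmetic exact, so the round trip returns precisely the original entries rather than floating-point approximations.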
This lemma suggests that "reduces to" is misleading: where $A \longrightarrow B$, we shouldn't think of $B$ as "after" or "simpler than" $A$. Instead we should think of them as interreducible or interrelated. Below is a picture of the idea. The matrices from the start of this section and their reduced echelon form version are shown in a cluster. They are all interreducible; these relationships are shown also.
We say that matrices that reduce to each other are "equivalent with respect to the relationship of row reducibility". The next result verifies this statement using the definition of an equivalence.[1]
- Lemma 1.5
Between matrices, "reduces to" is an equivalence relation.
- Proof
We must check the conditions (i) reflexivity, that any matrix reduces to itself, (ii) symmetry, that if $A$ reduces to $B$ then $B$ reduces to $A$, and (iii) transitivity, that if $A$ reduces to $B$ and $B$ reduces to $C$ then $A$ reduces to $C$.
Reflexivity is easy; any matrix reduces to itself in zero row operations.
That the relationship is symmetric is Lemma 1.4: if $A$ reduces to $B$ by some row operations then also $B$ reduces to $A$ by reversing those operations.
For transitivity, suppose that $A$ reduces to $B$ and that $B$ reduces to $C$. Linking the reduction steps from $A \longrightarrow \cdots \longrightarrow B$ with those from $B \longrightarrow \cdots \longrightarrow C$ gives a reduction from $A$ to $C$.
- Definition 1.6
Two matrices that are interreducible by the elementary row operations are row equivalent.
The diagram below shows the collection of all matrices as a box. Inside that box, each matrix lies in some class. Matrices are in the same class if and only if they are interreducible. The classes are disjoint: no matrix is in two distinct classes. The collection of matrices has been partitioned into row equivalence classes.[2]
One of the classes in this partition is the cluster of matrices shown above, expanded to include all of the nonsingular matrices.
The next subsection proves that the reduced echelon form of a matrix is unique, that is, that every matrix reduces to one and only one reduced echelon form matrix. Rephrased in terms of the row-equivalence relationship, we shall prove that every matrix is row equivalent to one and only one reduced echelon form matrix. In terms of the partition what we shall prove is: every equivalence class contains one and only one reduced echelon form matrix. So each reduced echelon form matrix serves as a representative of its class.
After that proof we shall, as mentioned in the introduction to this section, have a way to decide if one matrix can be derived from another by row reduction. We just apply the Gauss-Jordan procedure to both and see whether or not they come to the same reduced echelon form.
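The decision procedure just described can be sketched as below. It relies on the uniqueness result that the next subsection proves. The names `rref` and `row_equivalent` are hypothetical, and `rref` is a compact one-pass variant that rescales and clears each column as soon as its leading entry is found, rather than the three separate stages of Example 1.1.

```python
from fractions import Fraction


def rref(rows):
    """Reduced echelon form by a one-pass Gauss-Jordan reduction (sketch)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [a / lead for a in m[r]]  # make the leading entry a one
        for i in range(len(m)):
            if i != r:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return m


def row_equivalent(a, b):
    """Decide row equivalence by comparing reduced echelon forms."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return False  # matrices of different sizes are never row equivalent
    return rref(a) == rref(b)
```

For example, `row_equivalent([[1, 2], [3, 4]], [[2, 0], [0, 5]])` is true because both matrices reduce to the identity, while `row_equivalent([[1, 2], [2, 4]], [[1, 2], [3, 4]])` is false because the first matrix reduces to a form with a row of zeros.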
Exercises
- This exercise is recommended for all readers.
- Problem 1
Use Gauss-Jordan reduction to solve each system.
- This exercise is recommended for all readers.
- Problem 2
Find the reduced echelon form of each matrix.
- This exercise is recommended for all readers.
- Problem 3
Find each solution set by using Gauss-Jordan reduction, then reading off the parametrization.
- Problem 4
Give two distinct echelon form versions of this matrix.
- This exercise is recommended for all readers.
- Problem 5
List the reduced echelon forms possible for each size.
- This exercise is recommended for all readers.
- Problem 6
What results from applying Gauss-Jordan reduction to a nonsingular matrix?
- Problem 7
The proof of Lemma 1.4 contains a reference to the condition on the row pivoting operation.
- The definition of row operations has an condition on the swap operation . Show that in this condition is not needed.
- Write down a matrix with nonzero entries, and show that the operation is not reversed by .
- Expand the proof of that lemma to make explicit exactly where the condition on pivoting is used.