Commutative Algebra/The Cayley–Hamilton theorem and Nakayama's lemma

Determinants within a commutative ring


We shall now derive the notion of a determinant in the setting of a commutative ring.

Definition 7.1 (Determinant):

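Let $R$ be a commutative ring, and let $n \in \mathbb{N}$. A determinant is a function $\det \colon R^{n \times n} \to R$ satisfying the following three axioms:

  1. $\det(I_n) = 1$, where $I_n$ is the $n \times n$ identity matrix.
  2. If $A$ is a matrix such that two adjacent columns are equal, then $\det(A) = 0$.
  3. For each $j \in \{1, \ldots, n\}$ we have $\det(a_1, \ldots, a_{j-1}, \lambda a_j + \mu b_j, a_{j+1}, \ldots, a_n) = \lambda \det(a_1, \ldots, a_{j-1}, a_j, a_{j+1}, \ldots, a_n) + \mu \det(a_1, \ldots, a_{j-1}, b_j, a_{j+1}, \ldots, a_n)$, where $a_1, \ldots, a_n, b_j \in R^n$ are columns and $\lambda, \mu \in R$.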

We shall later see that there exists exactly one determinant.

Theorem 7.2 (Properties of a (the) determinant):

  1. If $A$ has a column consisting entirely of zeroes, then $\det(A) = 0$.
  2. If $A$ is a matrix, and one adds a multiple of one column to an adjacent column, then $\det(A)$ does not change.
  3. If two adjacent columns of $A$ are exchanged, then $\det(A)$ is multiplied by $-1$.
  4. If any two columns of a matrix $A$ are exchanged, then $\det(A)$ is multiplied by $-1$.
  5. If $A$ is a matrix, and one adds a multiple of one column to any other column, then $\det(A)$ does not change.
  6. If $A$ is a matrix that has two equal columns, then $\det(A) = 0$.
  7. Let $\sigma \in S_n$ be a permutation, where $S_n$ is the $n$-th symmetric group. If $A = (a_1, \ldots, a_n)$, then $\det(a_{\sigma(1)}, \ldots, a_{\sigma(n)}) = \operatorname{sgn}(\sigma) \det(A)$.
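
The column properties listed above can be checked numerically on a small example. The following is a minimal sketch, assuming the explicit permutation-sum formula for the determinant that the text only establishes later (theorems 7.4 and 7.6) and working over the integers; the helper names `sign`, `det`, `swap_columns` and `add_column_multiple` are illustrative and not part of the text.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det(a):
    """Permutation-sum formula: sum over s of sgn(s) * prod_j a[s(j)][j]."""
    total = 0
    for perm in permutations(range(len(a))):
        term = sign(perm)
        for j, i in enumerate(perm):
            term *= a[i][j]
        total += term
    return total


def swap_columns(a, i, j):
    """Return a copy of a with columns i and j exchanged."""
    out = [row[:] for row in a]
    for row in out:
        row[i], row[j] = row[j], row[i]
    return out


def add_column_multiple(a, src, dst, factor):
    """Return a copy of a with factor * (column src) added to column dst."""
    return [
        [x + factor * row[src] if k == dst else x for k, x in enumerate(row)]
        for row in a
    ]


A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
swapped = swap_columns(A, 0, 2)                      # property 4: sign flips
sheared = add_column_multiple(A, 0, 2, 5)            # property 5: unchanged
repeated = [[row[0], row[1], row[0]] for row in A]   # property 6: zero
print(det(A), det(swapped), det(sheared), det(repeated))
```

Such a check on one matrix is of course no substitute for the proofs that follow; it only illustrates what the statements say.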

Proofs:

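1. Let $A = (a_1, \ldots, a_n)$, where the $j$-th column $a_j$ is the zero vector. Then by axiom 3 for the determinant (linearity in the $j$-th column with scalars $\lambda, \mu$), setting $\lambda = \mu = 0$ and $b_j = a_j$,

$$\det(A) = \det(a_1, \ldots, 0 \cdot a_j + 0 \cdot a_j, \ldots, a_n) = 0 \cdot \det(A) + 0 \cdot \det(A) = 0.$$

Alternatively, we may also set $\lambda = \mu = 1$ and $b_j = a_j = 0$ to obtain

$$\det(A) = \det(a_1, \ldots, a_j + b_j, \ldots, a_n) = \det(A) + \det(A),$$

from which the theorem follows by subtracting $\det(A)$ from both sides.

These proofs correspond to the two usual proofs of $f(0) = 0$ for a linear map $f$ (in whatever context).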

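2. If we set $b_j = a_{j-1}$ or $b_j = a_{j+1}$ (dependent on whether we add the column left or the column right to the current column), then axiom 3 gives us

$$\det(a_1, \ldots, a_{j-1}, a_j + \mu b_j, a_{j+1}, \ldots, a_n) = \det(a_1, \ldots, a_n) + \mu \det(a_1, \ldots, a_{j-1}, b_j, a_{j+1}, \ldots, a_n),$$

where the latter determinant is zero because we have two adjacent equal columns.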

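3. Consider the two matrices $(a_1, \ldots, a_j, a_{j+1}, \ldots, a_n)$ and $(a_1, \ldots, a_{j+1}, a_j, \ldots, a_n)$. By 7.2, 2. (applied three times) and axiom 3 for determinants (applied in the last step with the scalar $-1$), we have

$$\begin{aligned} \det(\ldots, a_j, a_{j+1}, \ldots) &= \det(\ldots, a_j, a_{j+1} + a_j, \ldots) \\ &= \det(\ldots, -a_{j+1}, a_{j+1} + a_j, \ldots) \\ &= \det(\ldots, -a_{j+1}, a_j, \ldots) \\ &= -\det(\ldots, a_{j+1}, a_j, \ldots). \end{aligned}$$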

4. We exchange the $i$-th and $j$-th column, where $i < j$, by first moving the $i$-th column successively to spot $j$ (using $j - i$ swaps of adjacent columns) and then the $j$-th column, which is now one step closer to the $i$-th spot, to spot $i$ using $j - i - 1$ swaps. In total, we used $2(j - i) - 1$ swaps, an odd number, and all the other columns are in the same place since they moved once to the right and once to the left. Hence, 4. follows from applying 3. to each swap.

5. Let's say we want to add $\lambda a_i$ to the $j$-th column. Then we first use 4. to put the $j$-th column adjacent to $a_i$, then use 2. to do the addition without change to the determinant, and then use 4. again to put the $j$-th column back to its place. In total, the only change our determinant has suffered was twice multiplication by $-1$, which cancels even in a general ring.

6. Let's say that the $i$-th column and the $j$-th column are equal, $i \neq j$. Then we subtract column $i$ from column $j$ (or, indeed, the other way round) without change to the determinant, obtain a matrix with a zero column and apply 1.

7. Split $\sigma$ into swaps, use 4. repeatedly and use further that $\operatorname{sgn} \colon S_n \to \{1, -1\}$ is a group homomorphism.

Note that we have only used axioms 2 & 3 for the preceding proof.

The following lemma will allow us to prove the uniqueness of the determinant, and also the formula $\det(AB) = \det(A) \det(B)$.

Lemma 7.3:

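Let $A = (a_{i,j})$ and $B = (b_{i,j})$ be two $n \times n$ matrices with entries in a commutative ring $R$. Then, for any determinant $\det$,

$$\det(AB) = \det(A) \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, b_{\sigma(1),1} \cdots b_{\sigma(n),n}.$$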

Proof:

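The matrix $AB$ has $j$-th column $\sum_{k=1}^n b_{k,j} a_k$, where $a_k$ denotes the $k$-th column of $A$. Hence, by axiom 3 for determinants and theorem 7.2, 7. and 6., we obtain, denoting $A = (a_1, \ldots, a_n)$:

$$\begin{aligned} \det(AB) &= \det\left( \sum_{k_1=1}^n b_{k_1,1} a_{k_1}, \ldots, \sum_{k_n=1}^n b_{k_n,n} a_{k_n} \right) \\ &= \sum_{k_1, \ldots, k_n = 1}^n b_{k_1,1} \cdots b_{k_n,n} \det(a_{k_1}, \ldots, a_{k_n}) \\ &= \sum_{\sigma \in S_n} b_{\sigma(1),1} \cdots b_{\sigma(n),n} \det(a_{\sigma(1)}, \ldots, a_{\sigma(n)}) \\ &= \det(A) \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, b_{\sigma(1),1} \cdots b_{\sigma(n),n}. \end{aligned}$$

Here the third equality uses 6. (every summand in which two of the indices $k_1, \ldots, k_n$ coincide vanishes), and the fourth uses 7.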

Theorem 7.4 (Uniqueness of the determinant):

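For each commutative ring $R$ and each $n \in \mathbb{N}$, there is at most one determinant, and if it exists, it equals

$$\det(A) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, a_{\sigma(1),1} \cdots a_{\sigma(n),n}, \quad A = (a_{i,j}).$$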

Proof:

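Let $C = (c_{i,j})$ be an arbitrary matrix, and set $A = I_n$ and $B = C$ in lemma 7.3. Then we obtain by axiom 1 for determinants (the first time we use that axiom)

$$\det(C) = \det(I_n C) = \det(I_n) \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, c_{\sigma(1),1} \cdots c_{\sigma(n),n} = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, c_{\sigma(1),1} \cdots c_{\sigma(n),n}.$$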

Theorem 7.5 (Multiplicativity of the determinant):

If $\det$ is a determinant, then for all $n \times n$ matrices $A$ and $B$

$$\det(AB) = \det(A) \det(B).$$

Proof:

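From lemma 7.3 and theorem 7.4 we may infer, writing $B = (b_{i,j})$,

$$\det(AB) = \det(A) \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, b_{\sigma(1),1} \cdots b_{\sigma(n),n} = \det(A) \det(B).$$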
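
As a sanity check of multiplicativity, the following minimal sketch compares $\det(AB)$ with $\det(A)\det(B)$ for two integer matrices. It assumes the permutation-sum formula of theorem 7.4 as the determinant; the names `sign`, `det` and `mat_mul` and the sample matrices are illustrative choices, not taken from the text.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det(a):
    """Permutation-sum formula: sum over s of sgn(s) * prod_j a[s(j)][j]."""
    total = 0
    for perm in permutations(range(len(a))):
        term = sign(perm)
        for j, i in enumerate(perm):
            term *= a[i][j]
        total += term
    return total


def mat_mul(a, b):
    """Ordinary matrix product of two square matrices of equal size."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
print(det(mat_mul(A, B)), det(A) * det(B))
```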

Theorem 7.6 (Existence of the determinant):

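Let $R$ be a commutative ring. Then

$$\det \colon R^{n \times n} \to R, \quad \det(A) := \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, a_{\sigma(1),1} \cdots a_{\sigma(n),n}$$

is a determinant.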
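
The permutation sum can be evaluated literally. Below is a minimal sketch that does so over the ring $\mathbb{Z}/m\mathbb{Z}$ (chosen here only as a convenient example of a commutative ring that is not a field when $m$ is composite); the function names and the `modulus` parameter are assumptions made for illustration. The sum has $n!$ terms, so this is meant for small matrices only.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def leibniz_det(a, modulus=None):
    """Evaluate sum over s of sgn(s) * prod_j a[s(j)][j].

    With modulus=None the entries are treated as integers; otherwise the
    result is reduced modulo `modulus`, i.e. computed in Z/modulus.
    """
    total = 0
    for perm in permutations(range(len(a))):
        term = sign(perm)
        for j, i in enumerate(perm):
            term *= a[i][j]
        total += term
    return total if modulus is None else total % modulus


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(leibniz_det(identity))                      # axiom 1
print(leibniz_det([[2, 3], [4, 5]], modulus=6))   # computed in Z/6
```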

Proof:

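First of all, $I_n$ has zero entries everywhere except on the diagonal. Hence, if $A = I_n$, then $a_{\sigma(1),1} \cdots a_{\sigma(n),n}$ vanishes except when $\sigma(j) = j$ for all $j$, i.e. $\sigma$ is the identity. Hence $\det(I_n) = 1$.

Let now $A$ be a matrix whose $j$-th and $(j+1)$-th columns are equal, and let $\tau \in S_n$ be the swap exchanging $j$ and $j+1$. The function

$$S_n \to S_n, \quad \sigma \mapsto \sigma \circ \tau$$

is bijective, since the inverse is given by the function itself. Furthermore, since it amounts to composing $\sigma$ with another swap, it is sign reversing. Hence, we have

$$\det(A) = \sum_{\sigma \in S_n, \, \operatorname{sgn}(\sigma) = 1} \left( \prod_{k=1}^n a_{\sigma(k),k} - \prod_{k=1}^n a_{(\sigma \circ \tau)(k),k} \right).$$

Now since the $j$-th and $(j+1)$-th column of $A$ are identical, $\prod_{k=1}^n a_{\sigma(k),k} = \prod_{k=1}^n a_{(\sigma \circ \tau)(k),k}$. Hence $\det(A) = 0$.

Linearity follows from the linearity of each summand: if the $j$-th column of the matrix is $\lambda a_j + \mu b_j$ with $b_j = (b_{1,j}, \ldots, b_{n,j})^T$, then

$$\sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \left( \lambda a_{\sigma(j),j} + \mu b_{\sigma(j),j} \right) \prod_{k \neq j} a_{\sigma(k),k} = \lambda \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, a_{\sigma(j),j} \prod_{k \neq j} a_{\sigma(k),k} + \mu \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, b_{\sigma(j),j} \prod_{k \neq j} a_{\sigma(k),k}.$$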

Theorem 7.7:

The determinant of any matrix equals the determinant of the transpose of that matrix.

Proof:

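Observe that inversion is a bijection on $S_n$ the inverse of which is given by inversion ($(\sigma^{-1})^{-1} = \sigma$). Further observe that $\operatorname{sgn}(\sigma^{-1}) = \operatorname{sgn}(\sigma)$, since we just apply all the transpositions in reverse order. Hence, for $A = (a_{i,j})$,

$$\det(A^T) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{j=1}^n a_{j,\sigma(j)} = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{j=1}^n a_{\sigma^{-1}(j),j} = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma^{-1}) \prod_{j=1}^n a_{\sigma^{-1}(j),j} = \det(A).$$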

Theorem 7.8 (column expansion):

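Let $A = (a_{i,j})$ be an $n \times n$ matrix over a commutative ring $R$. For $i, j \in \{1, \ldots, n\}$ define $A_{i,j}$ to be the $(n-1) \times (n-1)$ matrix obtained by crossing out the $i$-th row and $j$-th column from $A$. Then for any $j \in \{1, \ldots, n\}$ we have

$$\det(A) = \sum_{i=1}^n (-1)^{i+j} a_{i,j} \det(A_{i,j}).$$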
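
The expansion formula translates directly into a recursive procedure. The following is a minimal sketch over the integers; the names `minor` and `det_by_column` are illustrative, indices are 0-based (which does not change the parity of $i + j$), and the determinant of the empty $0 \times 0$ matrix is taken to be $1$ so that the recursion terminates.

```python
def minor(a, i, j):
    """Matrix obtained from a by crossing out row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det_by_column(a, j=0):
    """Expand the determinant of a along column j.

    The sub-determinants are themselves expanded along their first column.
    """
    n = len(a)
    if n == 0:
        return 1  # convention for the empty matrix
    return sum(
        (-1) ** (i + j) * a[i][j] * det_by_column(minor(a, i, j))
        for i in range(n)
    )


A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
# The theorem asserts that every choice of column gives the same value.
print([det_by_column(A, j) for j in range(3)])
```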

Proof 1:

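We prove the theorem from the formula for the determinant given by theorems 7.4 and 7.6.

Let $j \in \{1, \ldots, n\}$ be fixed. For each $i \in \{1, \ldots, n\}$, we define

$$S_n^{(i)} := \{ \sigma \in S_n \mid \sigma(j) = i \}.$$

Then

$$\begin{aligned} \det(A) &= \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{k=1}^n a_{\sigma(k),k} \\ &= \sum_{i=1}^n a_{i,j} \sum_{\sigma \in S_n^{(i)}} \operatorname{sgn}(\sigma) \prod_{k \neq j} a_{\sigma(k),k} \\ &= \sum_{i=1}^n (-1)^{i+j} a_{i,j} \det(A_{i,j}). \end{aligned}$$

For the last equality, note that deleting $j$ from the domain and $i$ from the range of $\sigma \in S_n^{(i)}$ and renumbering yields a bijection $S_n^{(i)} \to S_{n-1}$, $\sigma \mapsto \sigma'$, with $\operatorname{sgn}(\sigma) = (-1)^{i+j} \operatorname{sgn}(\sigma')$, and that $\prod_{k \neq j} a_{\sigma(k),k}$ is exactly the product of entries of $A_{i,j}$ belonging to $\sigma'$.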

Proof 2:

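We note that all of the above derivations could have been done with rows instead of columns (which amounts to nothing more than exchanging $a_{i,j}$ with $a_{j,i}$ each time), and would have ended up with the same formula for the determinant since

$$\sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{k=1}^n a_{\sigma(k),k} = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{k=1}^n a_{k,\sigma(k)}$$

as argued in theorem 7.7.

Hence, we prove that the function $R^{n \times n} \to R$ given by the formula $A \mapsto \sum_{i=1}^n (-1)^{i+j} a_{i,j} \det(A_{i,j})$ satisfies 1 - 3 of 7.1 with rows instead of columns, and then apply theorem 7.4 with rows instead of columns.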

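1.

Set $A = I_n$ to obtain, with $\delta_{i,j}$ denoting the entries of $I_n$,

$$\sum_{i=1}^n (-1)^{i+j} \delta_{i,j} \det((I_n)_{i,j}) = (-1)^{2j} \det(I_{n-1}) = 1.$$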

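2.

Let $A$ have two equal adjacent rows, the $k$-th and $(k+1)$-th, say. Then

$$\sum_{i=1}^n (-1)^{i+j} a_{i,j} \det(A_{i,j}) = (-1)^{k+j} a_{k,j} \det(A_{k,j}) + (-1)^{k+1+j} a_{k+1,j} \det(A_{k+1,j}) = 0,$$

since each of the $A_{i,j}$ has two equal adjacent rows except for possibly $A_{k,j}$ and $A_{k+1,j}$, which is why, by theorem 7.6, the determinant is zero in all those cases, and further $a_{k,j} = a_{k+1,j}$ and $A_{k,j} = A_{k+1,j}$, since in both we deleted "the same" row.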

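3.

Let $A = (a_{i,j})$ and $B = (b_{i,j})$ be matrices that agree in every row except possibly the $k$-th, and let $\lambda, \mu \in R$. Define $C = (c_{i,j})$ to be the matrix that agrees with $A$ and $B$ outside the $k$-th row and whose $k$-th row is $\lambda$ times the $k$-th row of $A$ plus $\mu$ times the $k$-th row of $B$, and for each $i$ define $C_{i,j}$ as the matrix obtained by crossing out the $i$-th row and the $j$-th column from the matrix $C$. Then by theorem 7.6 and axiom 3 for the determinant (for rows),

$$\begin{aligned} \sum_{i=1}^n (-1)^{i+j} c_{i,j} \det(C_{i,j}) &= (-1)^{k+j} (\lambda a_{k,j} + \mu b_{k,j}) \det(C_{k,j}) + \sum_{i \neq k} (-1)^{i+j} c_{i,j} \left( \lambda \det(A_{i,j}) + \mu \det(B_{i,j}) \right) \\ &= \lambda \sum_{i=1}^n (-1)^{i+j} a_{i,j} \det(A_{i,j}) + \mu \sum_{i=1}^n (-1)^{i+j} b_{i,j} \det(B_{i,j}), \end{aligned}$$

where we used $C_{k,j} = A_{k,j} = B_{k,j}$ and $c_{i,j} = a_{i,j} = b_{i,j}$ for $i \neq k$.

Hence follows linearity by rows.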

For the sake of completeness, we also note the following lemma:

Lemma 7.9:

Let $A \in R^{n \times n}$ be an invertible matrix. Then $\det(A)$ is invertible in $R$.

Proof:

Indeed, $\det(A) \det(A^{-1}) = \det(A A^{-1}) = \det(I_n) = 1$ due to the multiplicativity of the determinant.

The converse is also true and will be proven in the next subsection.

Exercises

  • Exercise 7.1.1: Argue that the determinant, seen as a map from the set of all matrices (where scalars are $1 \times 1$-matrices), is idempotent.

Cramer's rule in the general case


Theorem 7.10 (Cramer's rule, solution of linear equations):

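Let $R$ be a commutative ring, let $A \in R^{n \times n}$ be a matrix with entries in $R$ and let $b \in R^n$ be a vector. If $A$ is invertible, the unique solution $x = (x_1, \ldots, x_n)^T$ to $Ax = b$ is given by

$$x_j = \det(A)^{-1} \det(A_j), \quad j \in \{1, \ldots, n\},$$

where $A_j$ is obtained by replacing the $j$-th column of $A$ by $b$.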
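
To illustrate that the rule only needs $\det(A)$ to be a unit, not that $R$ be a field, here is a minimal sketch solving a system over $\mathbb{Z}/m\mathbb{Z}$. The ring, the function names `det_mod` and `cramer_solve`, and the sample system are assumptions made for illustration. The three-argument `pow` with exponent `-1` (available in Python 3.8 and later) computes the inverse modulo `m` and raises `ValueError` when the determinant is not a unit.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det_mod(a, m):
    """Permutation-sum determinant of a, reduced modulo m."""
    total = 0
    for perm in permutations(range(len(a))):
        term = sign(perm)
        for j, i in enumerate(perm):
            term *= a[i][j]
        total += term
    return total % m


def cramer_solve(a, b, m):
    """Solve a x = b over Z/m via x_j = det(a)^(-1) * det(a_j)."""
    det_inverse = pow(det_mod(a, m), -1, m)  # ValueError if not a unit mod m
    solution = []
    for j in range(len(a)):
        # a_j: the matrix a with its j-th column replaced by b
        a_j = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(a)]
        solution.append(det_inverse * det_mod(a_j, m) % m)
    return solution


A = [[3, 1], [2, 3]]   # det = 7, a unit modulo 10
b = [1, 2]
x = cramer_solve(A, b, 10)
print(x)
```

Substituting the result back into the system modulo 10 confirms that it is a solution.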

Proof 1:

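Let $j \in \{1, \ldots, n\}$ be arbitrary but fixed. The determinant of $A$ is linear in the $j$-th column, and hence constitutes a linear map in the $j$-th column $f_j \colon R^n \to R$ mapping any vector to the determinant of $A$ with the $j$-th column replaced by that vector. If $a_j$ is the $j$-th column of $A$, $f_j(a_j) = \det(A)$. Furthermore, if we insert a different column $a_k$, $k \neq j$, into $f_j$, we obtain zero, since we obtain the determinant of a matrix where the column $a_k$ appears twice. We now consider the system of equations

$$\begin{aligned} a_{1,1} x_1 + \cdots + a_{1,n} x_n &= b_1 \\ &\ \ \vdots \\ a_{n,1} x_1 + \cdots + a_{n,n} x_n &= b_n \end{aligned}$$

where $x = (x_1, \ldots, x_n)^T$ is the unique solution of the system $Ax = b$, which exists since it is given by $A^{-1} b$ since $A$ is invertible. Since $f_j$ is linear, we find a $1 \times n$ matrix $C_j = (c_{j,1}, \ldots, c_{j,n})$ such that for all $v \in R^n$

$$f_j(v) = C_j v = c_{j,1} v_1 + \cdots + c_{j,n} v_n;$$

in fact, due to theorem 7.8, $c_{j,i} = (-1)^{i+j} \det(A_{i,j})$, where $A_{i,j}$ is obtained from $A$ by crossing out the $i$-th row and $j$-th column. We now add up the lines of the linear equation system above in the following way: We take $c_{j,1}$ times the first row, add $c_{j,2}$ times the second row and so on. Due to our considerations, this yields the result

$$\det(A) x_j = \sum_{k=1}^n f_j(a_k) x_k = f_j(b) = \det(A_j),$$

where $A_j$ is $A$ with its $j$-th column replaced by $b$.

Due to lemma 7.9, $\det(A)$ is invertible. Hence, we get

$$x_j = \det(A)^{-1} \det(A_j)$$

and hence the theorem.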

Proof 2:

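For all $j \in \{1, \ldots, n\}$, we define the matrix

$$X_j := (e_1, \ldots, e_{j-1}, x, e_{j+1}, \ldots, e_n);$$

this matrix shall represent a unit matrix, where the $j$-th column is replaced by the vector $x$. By expanding the $j$-th column, we find that the determinant of this matrix is given by $x_j$.

We now note that if $Ax = b$, then $A X_j = A_j$, the matrix $A$ with its $j$-th column replaced by $b$. Hence

$$x_j = \det(X_j) = \det(A^{-1} A_j) = \det(A^{-1}) \det(A_j) = \det(A)^{-1} \det(A_j),$$

where the last equality follows as in lemma 7.9.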

Theorem 7.11 (Cramer's rule, matrix inversion):

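Let $A$ be an $n \times n$ matrix with entries in a commutative ring $R$. We recall that the cofactor matrix $\operatorname{cof}(A)$ of $A$ is the matrix with $(i,j)$-th entry

$$\operatorname{cof}(A)_{i,j} := (-1)^{i+j} \det(A_{i,j}),$$

where $A_{i,j}$ is obtained from $A$ by crossing out the $i$-th row and $j$-th column. We further recall that the adjugate matrix $\operatorname{adj}(A)$ was given by

$$\operatorname{adj}(A) := \operatorname{cof}(A)^T.$$

With this definition, we have

$$\operatorname{adj}(A) A = A \operatorname{adj}(A) = \det(A) I_n.$$

In particular, if $\det(A)$ is a unit within $R$, then $A$ is invertible and

$$A^{-1} = \det(A)^{-1} \operatorname{adj}(A).$$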
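
The identity can be tested on a concrete integer matrix. The following minimal sketch builds the adjugate from cofactors and multiplies it with the matrix from both sides; the helper names (`minor`, `det`, `adjugate`, `mat_mul`) and the sample matrix are illustrative assumptions, and the determinant is computed by the permutation sum of theorem 7.4.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det(a):
    """Permutation-sum formula: sum over s of sgn(s) * prod_j a[s(j)][j]."""
    total = 0
    for perm in permutations(range(len(a))):
        term = sign(perm)
        for j, i in enumerate(perm):
            term *= a[i][j]
        total += term
    return total


def minor(a, i, j):
    """Matrix obtained from a by crossing out row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def adjugate(a):
    """Transpose of the cofactor matrix: entry (j, i) is the (i, j) cofactor."""
    n = len(a)
    return [
        [(-1) ** (i + j) * det(minor(a, i, j)) for i in range(n)]
        for j in range(n)
    ]


def mat_mul(a, b):
    """Ordinary matrix product of two square matrices of equal size."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
scaled_identity = [[det(A) * int(i == j) for j in range(3)] for i in range(3)]
print(mat_mul(adjugate(A), A) == scaled_identity)
print(mat_mul(A, adjugate(A)) == scaled_identity)
```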

Proof:

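For $i \in \{1, \ldots, n\}$, we set $e_i := (0, \ldots, 0, 1, 0, \ldots, 0)^T$, where the one is at the $i$-th place. Further, for $j \in \{1, \ldots, n\}$ we set $f_j$ to be the linear function from proof 1 of theorem 7.10 (which maps a vector to the determinant of $A$ with the $j$-th column replaced by that vector), and $C_j$ its $1 \times n$ matrix. Then $C_j$ is given by

$$C_j = (f_j(e_1), \ldots, f_j(e_n)) = \left( (-1)^{1+j} \det(A_{1,j}), \ldots, (-1)^{n+j} \det(A_{n,j}) \right)$$

due to theorem 7.8. Thus $C_j$ is the $j$-th row of $\operatorname{adj}(A)$. Hence, denoting the columns of $A$ by $a_1, \ldots, a_n$,

$$(\operatorname{adj}(A) A)_{j,k} = C_j a_k = f_j(a_k) = \begin{cases} \det(A) & j = k \\ 0 & j \neq k, \end{cases}$$

where we used the properties of $f_j$ established in proof 1 of theorem 7.10. The identity $A \operatorname{adj}(A) = \det(A) I_n$ follows by applying what we have shown to $A^T$ and transposing, using theorem 7.7 and $\operatorname{adj}(A^T) = \operatorname{adj}(A)^T$.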

The theorems


Now we may finally apply the machinery we have set up to prove the following two fundamental theorems.

Theorem 7.12 (the Cayley–Hamilton theorem):

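Let $M$ be a finitely generated $R$-module, let $\phi \colon M \to M$ be a module morphism and let $I$ be an ideal of $R$ such that $\phi(M) \subseteq IM$. Then there exist $n \in \mathbb{N}$ and $a_0, \ldots, a_{n-1} \in I$ such that

$$\phi^n + a_{n-1} \phi^{n-1} + \cdots + a_1 \phi + a_0 = 0;$$

this equation is to be read as

$$\forall m \in M: \ \phi^n(m) + a_{n-1} \phi^{n-1}(m) + \cdots + a_1 \phi(m) + a_0 m = 0,$$

where $\phi^k(m)$ means applying $\phi$ to $m$ $k$ times.

Note that the polynomial in $\phi$ is monic, that is, the leading coefficient is $1$, the unit of the ring in question.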

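Proof: Assume that $\{m_1, \ldots, m_n\}$ is a generating set for $M$. Since $\phi(M) \subseteq IM$, we may write

$$\phi(m_i) = \sum_{j=1}^n a_{i,j} m_j, \quad i \in \{1, \ldots, n\}$$ (*),

where $a_{i,j} \in I$ for each $i, j$. We now define a new commutative ring as follows:

$$R' := R[\phi],$$

where we regard each element $r$ of $R$ as the endomorphism $m \mapsto rm$ on $M$. That is, $R'$ is a subring of the endomorphism ring of $M$ (that is, multiplication is given by composition). Since $\phi$ is $R$-linear, $R'$ is commutative.

Now to every $n \times n$ matrix $C = (c_{i,j})$ with entries in $R'$ we may associate a function

$$C \colon M^n \to M^n, \quad (x_1, \ldots, x_n)^T \mapsto \left( \sum_{j=1}^n c_{1,j}(x_j), \ldots, \sum_{j=1}^n c_{n,j}(x_j) \right)^T.$$

By exploiting the linearities of all functions involved, it is easy to see that for another $n \times n$ matrix with entries in $R'$ called $D$, the associated function of $CD$ equals the composition of the associated functions of $C$ and $D$; that is, $(CD)(x) = C(D(x))$ for all $x \in M^n$.

Now with this in mind, we may rewrite the system (*) as follows:

$$A (m_1, \ldots, m_n)^T = 0,$$

where $A$ has $(i,j)$-th entry $\delta_{i,j} \phi - a_{i,j}$. Now define $B := \operatorname{adj}(A)$. From Cramer's rule (theorem 7.11) we obtain that

$$BA = \det(A) I_n,$$

which is why

$$(\det(A) m_1, \ldots, \det(A) m_n)^T = (BA)(m_1, \ldots, m_n)^T = B\left( A (m_1, \ldots, m_n)^T \right) = 0$$, the zero vector.

Hence, $\det(A) \in R'$ is the zero mapping, since it sends all generators to zero. Now further, as can be seen e.g. from the representation given in theorem 7.4, it has the form

$$\det(A) = \phi^n + a_{n-1} \phi^{n-1} + \cdots + a_1 \phi + a_0$$

for suitable $a_0, \ldots, a_{n-1} \in I$.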
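
In the classical special case $M = R^n$, $I = R$ and $\phi$ given by a square matrix, the determinant appearing in the proof is the characteristic polynomial, and the theorem says that the matrix is a root of it. The following minimal sketch checks this for integer matrices; polynomials are represented as coefficient lists with the constant term first, and all function names are illustrative assumptions rather than notation from the text.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def char_poly(a):
    """Coefficients (constant term first) of det(x * I - a), via the
    permutation sum with polynomial entries delta_ij * x - a_ij."""
    n = len(a)
    total = [0] * (n + 1)
    for perm in permutations(range(n)):
        term = [sign(perm)]
        for j, i in enumerate(perm):
            entry = [-a[i][j], 1] if i == j else [-a[i][j]]
            term = poly_mul(term, entry)
        for k, c in enumerate(term):
            total[k] += c
    return total


def mat_mul(a, b):
    """Ordinary matrix product of two square matrices of equal size."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def evaluate_at_matrix(coeffs, a):
    """Evaluate the polynomial at the matrix a using Horner's scheme."""
    n = len(a)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, a)
        for i in range(n):
            result[i][i] += c
    return result


A = [[1, 2], [3, 4]]
p = char_poly(A)                    # x^2 - 5x - 2
print(p, evaluate_at_matrix(p, A))  # the second value is the zero matrix
```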

Theorem 7.13 (Nakayama's lemma):

Let $R$ be a ring, $M$ a finitely generated $R$-module and $I \le R$ an ideal such that $IM = M$. Then there exists an $a \in I$ such that $(1 + a) M = 0$.

Proof:

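Choose $\phi = \operatorname{Id}_M$ in theorem 7.12, which is possible since $\operatorname{Id}_M(M) = M = IM$, to obtain for $m \in M$ that

$$(1 + a_{n-1} + \cdots + a_1 + a_0) m = 0$$

for suitable $a_0, \ldots, a_{n-1} \in I$, since the identity is idempotent. Hence $a := a_{n-1} + \cdots + a_0 \in I$ is as required.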
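
A small finite example makes the statement concrete. The sketch below takes, as an assumption chosen purely for illustration, $R = \mathbb{Z}/r\mathbb{Z}$, a principal ideal $I = gR$ and the cyclic module $M = \mathbb{Z}/d\mathbb{Z}$ with $d$ dividing $r$; it verifies the hypothesis $IM = M$ by brute force and then lists all $a \in I$ with $(1 + a)M = 0$. The function name `nakayama_witnesses` and its parameters are hypothetical.

```python
def nakayama_witnesses(ring_mod, ideal_gen, module_mod):
    """Return all a in I with (1 + a) M = 0, where R = Z/ring_mod,
    I is the ideal generated by ideal_gen and M = Z/module_mod.

    module_mod must divide ring_mod so that M is an R-module.
    """
    if ring_mod % module_mod != 0:
        raise ValueError("module_mod must divide ring_mod")
    ideal = sorted({ideal_gen * r % ring_mod for r in range(ring_mod)})
    module = range(module_mod)
    # For a cyclic module and a principal ideal, the set of products a * x
    # is already the submodule IM.
    im = {a * x % module_mod for a in ideal for x in module}
    if im != set(module):
        raise ValueError("hypothesis IM = M is not satisfied")
    return [
        a for a in ideal
        if all((1 + a) * x % module_mod == 0 for x in module)
    ]


# R = Z/6, I = (3), M = Z/2: here 3 * M = M, and 1 + 3 = 4 annihilates M.
print(nakayama_witnesses(6, 3, 2))
```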