Topics in Abstract Algebra

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/Topics_in_Abstract_Algebra

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.

Non-commutative rings

A ring is not necessarily commutative but is assumed to have the multiplicative identity.

Proposition.

Let $R$ be a simple ring. Then every morphism $R\to R$ is either zero or an isomorphism. (Schur's lemma)

Theorem (Levitzky).

Let $R$ be a right noetherian ring. Then every (left or right) nil ideal is nilpotent.

Commutative algebra

The set of all prime ideals in a commutative ring $A$ is called the spectrum of $A$ and denoted by $\operatorname {Spec} (A)$ . (The motivation for the term comes from the theory of a commutative Banach algebra.)

Spec A

The set of all nilpotent elements in $A$ forms an ideal called the nilradical of $A$. Given any ideal ${\mathfrak a}$, the pre-image in $A$ of the nilradical of $A/{\mathfrak a}$ is an ideal called the radical of ${\mathfrak a}$ and denoted by ${\sqrt {\mathfrak a}}$. Explicitly, $x\in {\sqrt {\mathfrak a}}$ if and only if $x^{n}\in {\mathfrak a}$ for some $n$.
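As a small illustration of these definitions, the following Python sketch computes the nilradical of $\mathbf{Z}/n\mathbf{Z}$ by brute force and compares it with the radical of the ideal $(n)$ in $\mathbf{Z}$. The ring $\mathbf{Z}/n\mathbf{Z}$ and the function names `nilradical` and `radical_generator` are choices made for this example, not part of the text.

```python
def nilradical(n):
    """Nilpotent elements of Z/nZ, found by brute force."""
    nil = set()
    for x in range(n):
        power = x % n
        for _ in range(n):          # checks x^1, ..., x^n, more than enough
            if power == 0:
                nil.add(x)
                break
            power = (power * x) % n
    return nil


def radical_generator(n):
    """Product of the distinct primes dividing n; it generates sqrt((n)) in Z."""
    rad, p, m = 1, 2, n
    while m > 1:
        if m % p == 0:
            rad *= p
            while m % p == 0:
                m //= p
        p += 1
    return rad


n = 12
print(sorted(nilradical(n)))    # nilpotent elements of Z/12Z
print(radical_generator(n))     # generator of sqrt((12)) in Z
```

The nilradical of $\mathbf{Z}/n\mathbf{Z}$ is the image of ${\sqrt {(n)}}$, so the first set should consist exactly of the multiples of the second number.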

Proposition A.14.

Let ${\mathfrak i},{\mathfrak j}\triangleleft A$.

(i) ${\sqrt {{\mathfrak {i}}^{n}}}={\sqrt {\mathfrak {i}}}$
(ii) ${\sqrt {{\mathfrak {i}}{\mathfrak {j}}}}={\sqrt {{\mathfrak {i}}\cap {\mathfrak {j}}}}={\sqrt {\mathfrak {i}}}\cap {\sqrt {\mathfrak {j}}}$

Proof. Routine. $\square$

Exercise.

A ring has only one prime ideal if and only if its nilradical is maximal.

Exercise.

Every prime ideal in a finite ring is maximal.

Proposition A.2.

Let $A\neq 0$ be a ring. If every proper principal ideal in $A$ is prime, then $A$ is a field.

Proof. Let $0\neq x\in A$. If $(x^{2})=(1)$, then $x$ is a unit. Otherwise, $(x^{2})$ is prime and contains $x^{2}$, so $x\in (x^{2})$. Thus, we can write $x=ax^{2}$. Since $(0)$ is prime, $A$ is a domain. Hence, $1=ax$. $\square$

Lemma.

Let ${\mathfrak p}\triangleleft A$. Then ${\mathfrak p}$ is prime if and only if ${\mathfrak p}\subsetneq {\mathfrak a}\triangleleft A,{\mathfrak p}\subsetneq {\mathfrak b}\triangleleft A$ implies ${\mathfrak a}{\mathfrak b}\not \subset {\mathfrak p}$.

Proof. ($\Rightarrow$) Clear. ($\Leftarrow$) Let ${\overline {x}}$ be the image of $x\in A$ in $A/{\mathfrak p}$. Suppose ${\overline {a}}$ is a zero-divisor; that is, ${\overline {a}}{\overline {b}}=0$ for some $b\in A\backslash {\mathfrak p}$. Let ${\mathfrak a}=(a,{\mathfrak p})$ and ${\mathfrak b}=(b,{\mathfrak p})$. Since ${\mathfrak a}{\mathfrak b}\subset (ab)+{\mathfrak p}\subset {\mathfrak p}$ and ${\mathfrak b}$ is strictly larger than ${\mathfrak p}$, the hypothesis forces ${\mathfrak a}={\mathfrak p}$. That is, ${\overline {a}}=0$. $\square$

Theorem A.11 (multiplicative avoidance).

Let $S\subset A$ be a multiplicative system. If ${\mathfrak a}\triangleleft A$ is disjoint from $S$, then there exists a prime ideal ${\mathfrak p}\supset {\mathfrak a}$ that is maximal among ideals disjoint from $S$.

Proof. By Zorn's lemma, the set of all ideals containing ${\mathfrak a}$ and disjoint from $S$ has a maximal element ${\mathfrak m}$. Let ${\mathfrak b}$ and ${\mathfrak c}$ be ideals strictly larger than ${\mathfrak m}$. Since ${\mathfrak m}$ is maximal, we find $b\in {\mathfrak b}\cap S$ and $c\in {\mathfrak c}\cap S$. By the definition of $S$, $bc\in S$; thus, ${\mathfrak b}{\mathfrak c}\not \subset {\mathfrak m}$. By the lemma, ${\mathfrak m}$ is then prime. $\square$

Note that the theorem applies in particular when $S$ contains only 1.

Exercise.

A domain A is a principal ideal domain if every prime ideal is principal.

A Goldman domain is a domain $A$ whose field of fractions $K$ is finitely generated as an $A$-algebra. When $A$ is a Goldman domain, $K$ always has the form $A[f^{-1}]$. Indeed, if $K=A[s_{1}^{-1},...,s_{n}^{-1}]$, let $s=s_{1}...s_{n}$. Then $K=A[s^{-1}]$.

Lemma.

Let $A$ be a domain with the field of fractions $K$, and $0\neq f\in A$. Then $K=A[f^{-1}]$ if and only if every nonzero prime ideal of $A$ contains $f$.

Proof. ($\Leftarrow$) Let $0\neq x\in A$, and $S=\{f^{n}|n\geq 0\}$. If $(x)$ is disjoint from $S$, then, by Theorem A.11, there is a prime ideal containing $(x)$ (hence nonzero) and disjoint from $S$, contradicting the hypothesis. Thus, $(x)$ contains some power of $f$, say, $yx=f^{n}$. Then $yx$ and so $x$ are invertible in $A[f^{-1}]$. ($\Rightarrow$) If ${\mathfrak p}$ is a nonzero prime ideal, it contains a nonzero element, say, $s$. Then we can write: $1/s=a/f^{n}$, or $f^{n}=as\in {\mathfrak p}$; thus, $f\in {\mathfrak p}$. $\square$

A prime ideal ${\mathfrak {p}}\in \operatorname {Spec} (A)$ is called a Goldman ideal if $A/{\mathfrak {p}}$ is a Goldman domain.

Theorem A.21.

Let $A$ be a ring and ${\mathfrak a}\triangleleft A$. Then ${\sqrt {\mathfrak a}}$ is the intersection of all minimal Goldman ideals of $A$ containing ${\mathfrak a}$.

Proof. By the ideal correspondence, it suffices to prove the case ${\mathfrak a}={\sqrt {\mathfrak a}}=0$. Let $0\neq f\in A$ and $S=\{f^{n}|n\geq 0\}$. Since $f$ is not nilpotent (otherwise it would be in ${\sqrt {(0)}}=0$), $S$ does not contain $0$, and by multiplicative avoidance there is a prime ideal ${\mathfrak g}$ that is maximal among ideals disjoint from $S$; in particular, $f\not \in {\mathfrak g}$. It remains to show that ${\mathfrak g}$ is a Goldman ideal. If ${\mathfrak p}\triangleleft A/{\mathfrak g}$ is a nonzero prime, then it contains the image of $f$: by the maximality of ${\mathfrak g}$, the pre-image of ${\mathfrak p}$ in $A$ meets $S$. By the lemma, the field of fractions of $A/{\mathfrak g}$ is obtained by inverting $f$ and so ${\mathfrak g}$ is a Goldman ideal. Hence, the intersection of all Goldman ideals reduces to zero. $\square$

In some rings, Goldman ideals are maximal; this will be discussed in the next section. On the other hand,

Lemma.

Let ${\mathfrak a}\triangleleft A$. Then ${\mathfrak a}$ is a Goldman ideal if and only if it is the contraction of a maximal ideal in $A[X]$.

Theorem.

The following are equivalent.

(1) For any ${\mathfrak a}\triangleleft A$, ${\sqrt {\mathfrak a}}$ is the intersection of all maximal ideals containing ${\mathfrak a}$.
(2) Every Goldman ideal is maximal.
(3) Every maximal ideal in $A[X]$ contracts to a maximal ideal in $A$.

Proof. Clear. $\square$

A ring satisfying the equivalent conditions in the theorem is called a Hilbert-Jacobson ring.

Lemma.

Let $A\subset B$ be domains such that $B$ is algebraic and of finite type over $A$. Then $A$ is a Goldman domain if and only if $B$ is a Goldman domain.

Proof. Let $K\subset L$ be the fields of fractions of $A$ and $B$ , respectively. $\square$

Theorem A.19.

Let $A$ be a Hilbert-Jacobson ring. Then $A[X]$ is a Hilbert-Jacobson ring.

Proof. Let ${\mathfrak q}\triangleleft A[X]$ be a Goldman ideal, and ${\mathfrak p}={\mathfrak q}\cap A$. It follows from Lemma something that $A/{\mathfrak p}$ is a Goldman domain since it is contained in $A[X]/{\mathfrak q}$, a Goldman domain. Since $A$ is a Hilbert-Jacobson ring, ${\mathfrak p}$ is maximal, so $A/{\mathfrak p}$ is a field. Hence $A[X]/{\mathfrak q}$ is a field; that is, ${\mathfrak q}$ is maximal. $\square$


Theorem A.5 (prime avoidance).

Let ${\mathfrak p}_{1},...,{\mathfrak p}_{r}\triangleleft A$ be ideals, at most two of which are not prime, and ${\mathfrak a}\triangleleft A$. If ${\mathfrak a}\subset \bigcup _{1}^{r}{\mathfrak p}_{i}$, then ${\mathfrak a}\subset {\mathfrak p}_{i}$ for some $i$.

Proof. Suppose ${\mathfrak a}\not \subset {\mathfrak p}_{i}$ for every $i$. We shall induct on $r$ to find $a\in {\mathfrak a}$ that is in no ${\mathfrak p}_{i}$, which contradicts the hypothesis. The case $r=1$ being trivial, suppose we find $a\in {\mathfrak a}$ such that $a\not \in {\mathfrak p}_{i}$ for $i<r$. We assume $a\in {\mathfrak p}_{r}$; else, we're done. Moreover, if ${\mathfrak p}_{i}\subset {\mathfrak p}_{r}$ for some $i<r$, then the theorem applies without ${\mathfrak p}_{i}$ and so this case is done by the inductive hypothesis. We thus assume ${\mathfrak p}_{i}\not \subset {\mathfrak p}_{r}$ for all $i<r$. Now, ${\mathfrak a}{\mathfrak p}_{1}...{\mathfrak p}_{r-1}\not \subset {\mathfrak p}_{r}$; if not, since ${\mathfrak p}_{r}$ is prime, one of the ideals on the left is contained in ${\mathfrak p}_{r}$, contradiction. Hence, there is $b$ in the left-hand side that is not in ${\mathfrak p}_{r}$. It follows that $a+b\not \in {\mathfrak p}_{i}$ for all $i\leq r$. Finally, we remark that the argument works without assuming ${\mathfrak p}_{1}$ and ${\mathfrak p}_{2}$ are prime. (TODO: too sketchy.) The proof is thus complete. $\square$

An element $p$ of a ring is a prime if $(p)$ is prime, and is an irreducible if $p=xy\Rightarrow$ either $x$ or $y$ is a unit.

We write $x|y$ if $(x)\ni y$, and say $x$ divides $y$. In a domain, a prime element is irreducible. (Suppose $x$ is prime and $x=yz$. Then either $x|y$ or $x|z$, say, the former. Then $sx=y$ for some $s$, and $sxz=x$. Canceling $x$ out we see $z$ is a unit.) The converse is false in general. We have however:

Proposition.

Suppose: for every $x$ and $y$, $(x)\cap (y)=(xy)$ whenever $(1)$ is the only principal ideal containing $(x,y)$. Then every irreducible is a prime.
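The remark that an irreducible need not be prime can be checked by hand in the ring $\mathbf{Z}[{\sqrt {-5}}]$, an example that is not taken from the text. The Python sketch below stores $a+b{\sqrt {-5}}$ as the pair `(a, b)`; the helper names `mul`, `norm` and `divisible_by_two` are hypothetical. It verifies that no element has norm 2, so $2$ admits no proper factorization, while $2$ divides $(1+{\sqrt {-5}})(1-{\sqrt {-5}})$ without dividing either factor.

```python
def mul(u, v):
    """Multiply a+b*sqrt(-5) and c+d*sqrt(-5), stored as pairs (a, b) and (c, d)."""
    a, b = u
    c, d = v
    return (a * c - 5 * b * d, a * d + b * c)


def norm(u):
    """Multiplicative norm N(a+b*sqrt(-5)) = a^2 + 5*b^2."""
    a, b = u
    return a * a + 5 * b * b


def divisible_by_two(u):
    """2 divides a+b*sqrt(-5) in Z[sqrt(-5)] iff a and b are both even."""
    return u[0] % 2 == 0 and u[1] % 2 == 0


# N(2) = 4 and elements of norm 1 are units, so a proper factorization
# 2 = xy would need N(x) = N(y) = 2. The box below contains every candidate.
norm_two = [(a, b) for a in range(-2, 3) for b in range(-1, 2)
            if norm((a, b)) == 2]

y, z = (1, 1), (1, -1)      # 1 + sqrt(-5) and 1 - sqrt(-5)
product = mul(y, z)         # the element 6 = 2 * 3

print(norm_two, product)
print(divisible_by_two(product), divisible_by_two(y), divisible_by_two(z))
```

Since `norm_two` is empty, $2$ is irreducible; since $2$ divides the product but neither factor, $(2)$ is not a prime ideal.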

Theorem A.16 (Chinese remainder theorem).

Let ${\mathfrak a}_{1},...,{\mathfrak a}_{n}\triangleleft A$. If ${\mathfrak a}_{j}+{\mathfrak a}_{i}=(1)$ for all $i\neq j$, then $\prod {\mathfrak a}_{i}\to A\to A/{\mathfrak a}_{1}\times \cdots \times A/{\mathfrak a}_{n}\to 0$ is exact.
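For $A=\mathbf{Z}$ and ${\mathfrak a}_{i}=(m_{i})$ with pairwise coprime $m_{i}$, exactness on the right says that every system of congruences has a solution, and exactness in the middle says the solution is unique modulo $\prod m_{i}$. The following Python sketch solves such a system incrementally; the function name `crt` is chosen for this example, and `pow(m, -1, mi)` (available from Python 3.8) computes an inverse of `m` modulo `mi`.

```python
from math import gcd


def crt(residues, moduli):
    """Solve x = r_i (mod m_i) for pairwise coprime m_i; returns x mod prod(m_i)."""
    x, m = 0, 1
    for r, mi in zip(residues, moduli):
        assert gcd(m, mi) == 1, "moduli must be pairwise coprime"
        # Choose t with x + m*t = r (mod mi).
        t = ((r - x) * pow(m, -1, mi)) % mi
        x, m = x + m * t, m * mi
    return x % m


print(crt([2, 3, 2], [3, 5, 7]))
```

Each step keeps the congruences already solved, because the correction `m * t` is a multiple of all earlier moduli.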

The Jacobson radical of a ring $A$ is the intersection of all maximal ideals.

Proposition A.6.

$x\in A$ is in the Jacobson radical if and only if $1-xy$ is a unit for every $y\in A$.

Proof. Let $x$ be in the Jacobson radical. If $1-xy$ is not a unit, it is in a maximal ideal ${\mathfrak {m}}$ . But then we have: $1=(1-xy)+xy$ , which is a sum of elements in ${\mathfrak {m}}$ ; thus, in ${\mathfrak {m}}$ , contradiction. Conversely, suppose $x$ is not in the Jacobson radical; that is, it is not in some maximal ideal ${\mathfrak {m}}$ . Then $(x,{\mathfrak {m}})$ is an ideal containing ${\mathfrak {m}}$ but strictly larger. Thus, it contains $1$ , and we can write: $1=xy+z$ with $y\in A$ and $z\in {\mathfrak {m}}$ . Then $1-xy\in {\mathfrak {m}}$ , and ${\mathfrak {m}}$ would cease to be proper, unless $1-xy$ is a non-unit. $\square$
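The criterion can be tested by brute force in the finite rings $\mathbf{Z}/n\mathbf{Z}$, whose maximal ideals are generated by the primes dividing $n$. The Python sketch below computes both sides of the proposition; the choice of ring and the function names are assumptions made for this illustration only.

```python
from math import gcd


def units(n):
    """Units of Z/nZ."""
    return {x for x in range(n) if gcd(x, n) == 1}


def maximal_ideals(n):
    """Maximal ideals of Z/nZ: the multiples of p, for each prime p dividing n."""
    primes = [p for p in range(2, n + 1)
              if n % p == 0 and all(p % q for q in range(2, p))]
    return [{x for x in range(n) if x % p == 0} for p in primes]


def jacobson_radical(n):
    """Intersection of all maximal ideals of Z/nZ (for n >= 2)."""
    rad = set(range(n))
    for m in maximal_ideals(n):
        rad &= m
    return rad


def unit_criterion(n):
    """Elements x such that 1 - x*y is a unit for every y in Z/nZ."""
    u = units(n)
    return {x for x in range(n)
            if all((1 - x * y) % n in u for y in range(n))}


print(sorted(jacobson_radical(12)), sorted(unit_criterion(12)))
```

The two sets agree for every $n\geq 2$, as the proposition predicts.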

Note that the nilradical is contained in the Jacobson radical, and they coincide in particular if prime ideals are maximal (e.g., the ring is artinian). Another instance of this is:

Exercise.

In $A[X]$, the nilradical and the Jacobson radical coincide.

Theorem A.17 (Hopkins).

Let A be a ring. Then the following are equivalent.

(1) $A$ is artinian.
(2) $A$ is noetherian and every prime ideal is maximal.
(3) $\operatorname {Spec} (A)$ is finite and discrete, and $A_{\mathfrak {m}}$ is noetherian for every maximal ideal ${\mathfrak m}$.

Proof. (1) $\Rightarrow$ (3): Let ${\mathfrak p}\triangleleft A$ be prime, and $0\neq x\in A/{\mathfrak p}$. Since $A/{\mathfrak p}$ is artinian (consider the short exact sequence), the descending sequence $(x^{n})$ stabilizes eventually; i.e., $x^{n}=ux^{n+1}$ for some $u$ and some $n$. Since $A/{\mathfrak p}$ is a domain, $1=ux$ and so $x$ is a unit. Hence, ${\mathfrak p}$ is maximal and so $\operatorname {Spec} (A)$ is discrete. It remains to show that it is finite. Let $S$ be the set of all finite intersections of maximal ideals. Let ${\mathfrak i}\in S$ be its minimal element, which we have by (1). We write ${\mathfrak i}={\mathfrak m}_{1}\cap ...\cap {\mathfrak m}_{n}$. Let ${\mathfrak m}$ be an arbitrary maximal ideal. Then ${\mathfrak m}\cap {\mathfrak i}\in S$ and so ${\mathfrak m}\cap {\mathfrak i}={\mathfrak i}$ by minimality. Thus, ${\mathfrak m}\supset {\mathfrak m}_{1}\cap ...\cap {\mathfrak m}_{n}\supset {\mathfrak m}_{1}...{\mathfrak m}_{n}$ and, since ${\mathfrak m}$ is prime, ${\mathfrak m}\supset {\mathfrak m}_{i}$ for some $i$; hence ${\mathfrak m}={\mathfrak m}_{i}$ by the maximality of ${\mathfrak m}_{i}$. (3) $\Rightarrow$ (2): We only have to show $A$ is noetherian. $\square$

A ring is said to be local if it has only one maximal ideal.

Proposition A.17.

Let $A$ be a nonzero ring. The following are equivalent.

(1) $A$ is local.
(2) For every $x\in A$, either $x$ or $1-x$ is a unit.
(3) The set of non-units is an ideal.

Proof. (1) $\Rightarrow$ (2): If $x$ is a non-unit, then $x$ is in the Jacobson radical; thus, $1-x$ is a unit by Proposition A.6. (2) $\Rightarrow$ (3): Let $x,y\in A$, and suppose $x$ is a non-unit. If $xy$ is a unit, then so are $x$ and $y$. Thus, $xy$ is a non-unit. Suppose $x,y$ are non-units; we show that $x+y$ is a non-unit by contradiction. If $x+y$ is a unit, then there exists a unit $a\in A$ such that $1=a(x+y)=ax+ay$. Thus either $ax$ or $1-ax=ay$ is a unit, whence either $x$ or $y$ is a unit, a contradiction. (3) $\Rightarrow$ (1): Let ${\mathfrak i}$ be the set of non-units. If ${\mathfrak m}\triangleleft A$ is maximal, it consists of non-units; thus, ${\mathfrak m}\subset {\mathfrak i}$, where we have the equality by the maximality of ${\mathfrak m}$. $\square$
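The three conditions can be compared directly in the rings $\mathbf{Z}/n\mathbf{Z}$, which are local exactly when $n$ is a prime power. The Python sketch below is only an illustration under that choice of ring; the function names are hypothetical.

```python
from math import gcd


def non_units(n):
    """Non-units of Z/nZ."""
    return {x for x in range(n) if gcd(x, n) != 1}


def unit_or_complement(n):
    """Condition (2): for every x, either x or 1 - x is a unit."""
    nu = non_units(n)
    return all(x not in nu or (1 - x) % n not in nu for x in range(n))


def non_units_form_ideal(n):
    """Condition (3): the non-units are closed under addition and under
    multiplication by arbitrary elements."""
    nu = non_units(n)
    closed_add = all((x + y) % n in nu for x in nu for y in nu)
    closed_mul = all((a * x) % n in nu for a in range(n) for x in nu)
    return closed_add and closed_mul


def is_prime_power(n):
    """Condition (1) for Z/nZ: n is a power of a single prime (n >= 2)."""
    p = next(q for q in range(2, n + 1) if n % q == 0)
    while n % p == 0:
        n //= p
    return n == 1


for n in (6, 8, 9, 12):
    print(n, is_prime_power(n), unit_or_complement(n), non_units_form_ideal(n))
```

For instance, in $\mathbf{Z}/6\mathbf{Z}$ the non-units $2$ and $3$ add up to the unit $5$, so condition (3) fails there.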

Example.

If ${\mathfrak p}$ is a prime ideal, then $A_{\mathfrak p}$ is a local ring where ${\mathfrak p}A_{\mathfrak p}$ is its unique maximal ideal.

Example.

If ${\sqrt {\mathfrak i}}$ is maximal, then $A/{\mathfrak i}$ is a local ring. In particular, $A/{\mathfrak m}^{n}$ $(n\geq 1)$ is local for any maximal ideal ${\mathfrak m}$.

Let $(A,{\mathfrak {m}})$ be a local noetherian ring.

Lemma.

(i) Let ${\mathfrak i}$ be a proper ideal of $A$. If $M$ is a finitely generated $A$-module such that $M={\mathfrak i}M$, then $M=0$.
(ii) The intersection of all ${\mathfrak m}^{k}$ over $k\geq 1$ is trivial.

Proof. We prove (i) by induction on the number of generators. Suppose $M\neq 0$, let $n$ be the least number of generators of $M$, and let $x_{1},...,x_{n}$ generate $M$. Since $M={\mathfrak i}M$, we can write

x_{1}=a_{1}x_{1}+a_{2}x_{2}+...+a_{n}x_{n}

where the $a_{i}$ are in ${\mathfrak i}$, and thus

(1-a_{1})x_{1}=a_{2}x_{2}+...+a_{n}x_{n}

Since $a_{1}$ is not a unit, $1-a_{1}$ is a unit; in fact, if $1-a_{1}$ is not a unit, it belongs to the unique maximal ideal ${\mathfrak m}$, which contains every non-unit, in particular $a_{1}$, and thus $1\in {\mathfrak m}$, which is nonsense. Thus $x_{2},...,x_{n}$ already generate $M$; this contradicts the minimality of $n$. $\square$

An ideal ${\mathfrak {q}}\triangleleft A$ is said to be primary if every zero-divisor in $A/{\mathfrak {q}}$ is nilpotent. Explicitly, this means that, whenever $xy\in {\mathfrak {q}}$ and $y\not \in {\mathfrak {q}}$ , $x\in {\sqrt {\mathfrak {q}}}$ . In particular, a prime ideal is primary.

Proposition.

If ${\mathfrak q}$ is primary, then ${\sqrt {\mathfrak q}}$ is prime. Conversely, if ${\sqrt {\mathfrak q}}$ is maximal, then ${\mathfrak q}$ is primary.

Proof. The first part is clear. Conversely, if ${\sqrt {\mathfrak q}}$ is maximal, then ${\mathfrak m}={\sqrt {\mathfrak q}}/{\mathfrak q}$ is a maximal ideal in $A/{\mathfrak q}$. It is also the nilradical of $A/{\mathfrak q}$, so every prime ideal contains it; hence it is the unique maximal ideal and $A/{\mathfrak q}$ is local. In particular, a zero-divisor in $A/{\mathfrak q}$ is a non-unit and so is contained in ${\mathfrak m}$; hence, nilpotent. $\square$

Exercise.

${\sqrt {\mathfrak q}}$ prime $\not \Rightarrow {\mathfrak q}$ primary.

Theorem A.8 (Primary decomposition).

Let $A$ be a noetherian ring. If ${\mathfrak i}\triangleleft A$, then ${\mathfrak i}$ is a finite intersection of primary ideals.

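Proof. Let $S$ be the set of all ideals that are not finite intersections of primary ideals. We want to show $S$ is empty. Suppose not, and let ${\mathfrak i}$ be a maximal element of $S$, which exists since $A$ is noetherian. We can write ${\mathfrak i}$ as an intersection of two ideals strictly larger than ${\mathfrak i}$. Indeed, since ${\mathfrak i}$ is in particular not primary, choose $x$ and $y\not \in {\mathfrak i}$ such that $xy\in {\mathfrak i}$ and no power of $x$ is in ${\mathfrak i}$. Let ${\mathfrak j}_{n}$ be the set of all $a\in A$ such that $ax^{n}\in {\mathfrak i}$; the chain ${\mathfrak j}_{1}\subset {\mathfrak j}_{2}\subset ...$ stabilizes, say, ${\mathfrak j}_{n}={\mathfrak j}_{2n}$. As in the proof of Theorem A.3, we can write: ${\mathfrak i}={\mathfrak j}_{n}\cap ({\mathfrak i}+(x^{n}))$. (If $z=i+ax^{n}$ with $i\in {\mathfrak i}$ and $zx^{n}\in {\mathfrak i}$, then $ax^{2n}\in {\mathfrak i}$, so $a\in {\mathfrak j}_{2n}={\mathfrak j}_{n}$ and $z\in {\mathfrak i}$.) Both ideals are strictly larger than ${\mathfrak i}$ since $y\in {\mathfrak j}_{n}$ and $x^{n}\not \in {\mathfrak i}$. By maximality, ${\mathfrak j}_{n},{\mathfrak i}+(x^{n})\not \in S$. Thus, they are finite intersections of primary ideals, but then so is ${\mathfrak i}$, contradiction. $\square$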

Proposition.

If $(0)$ is indecomposable, then the set of zero divisors is a union of minimal primes.

Integral extension

Let $A\subset B$ be rings. If $b\in B$ is a root of a monic polynomial $f\in A[X]$, then $b$ is said to be integral over $A$. If every element of $B$ is integral over $A$, then we say $B$ is integral over $A$ or $B$ is an integral extension of $A$. More generally, we say a ring morphism $f:A\to B$ is integral if $B$ is integral over the image of $A$. By replacing $A$ with $f(A)$, it suffices to study the case $A\subset B$, and that is what we do below.

Lemma A.9.

Let $b\in B$. Then the following are equivalent.

(1) $b$ is integral over $A$.
(2) $A[b]$ is finite over $A$.
(3) $A[b]$ is contained in an $A$-submodule of $B$ that is finite over $A$.

Proof. (1) means that, for some $n\geq 1$, some $a_{i}\in A$ and every $r\geq 0$, we can write:

b^{n+r}=-(b^{r+n-1}a_{n-1}+...+b^{r+1}a_{1}+b^{r}a_{0})

Thus, $1,b,...,b^{n-1}$ span $A[b]$ over $A$. Hence, (1) $\Rightarrow$ (2). Since (2) $\Rightarrow$ (3) trivially, it remains to show (3) $\Rightarrow$ (1). Let $M$ be the module in (3), which we take to be stable under multiplication by $b$ (that is, an $A[b]$-module), and let it be generated over $A$ by $x_{1},...,x_{n}$. Since $bx_{i}\in M$, we can write

bx_{i}=\sum _{j=1}^{n}c_{ij}x_{j}

where $c_{ij}\in A$. Denoting by $C$ the matrix $(c_{ij})$, this means that $\det(bI-C)$ annihilates $M$. Hence, $\det(bI-C)=0$ since $1\in A[b]\subset M$. Noting that $\det(bI-C)$ is a monic polynomial in $b$ with coefficients in $A$, we get (1). $\square$
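To make the determinant argument concrete, take $A=\mathbf{Z}$ and $b={\sqrt {2}}+{\sqrt {3}}$ (an example chosen here, not from the text), acting on the module with generators $1,{\sqrt {2}},{\sqrt {3}},{\sqrt {6}}$. The Python sketch below writes down the matrix $C$ of the relations $bx_{i}=\sum _{j}c_{ij}x_{j}$ and computes $\det(xI-C)$ with the Faddeev-LeVerrier recursion, a standard method substituted here for a symbolic determinant; the helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(C):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(xI - C) (Faddeev-LeVerrier)."""
    n = len(C)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    coeffs = [1]
    M = identity
    for k in range(1, n + 1):
        CM = matmul(C, M)
        trace = sum(CM[i][i] for i in range(n))
        c = -trace // k                 # the division is exact for integer C
        coeffs.append(c)
        M = [[CM[i][j] + c * identity[i][j] for j in range(n)]
             for i in range(n)]
    return coeffs


# Row i lists the coordinates of b * x_i in the basis (1, sqrt2, sqrt3, sqrt6).
C = [
    [0, 1, 1, 0],   # b * 1     = sqrt2 + sqrt3
    [2, 0, 0, 1],   # b * sqrt2 = 2 + sqrt6
    [3, 0, 0, 1],   # b * sqrt3 = 3 + sqrt6
    [0, 3, 2, 0],   # b * sqrt6 = 3*sqrt2 + 2*sqrt3
]

poly = char_poly(C)
b = 2 ** 0.5 + 3 ** 0.5
value = sum(c * b ** (len(poly) - 1 - i) for i, c in enumerate(poly))
print(poly, value)
```

The resulting polynomial is monic with integer coefficients, and the floating-point evaluation at $b$ is zero up to rounding, which exhibits $b$ as integral over $\mathbf{Z}$.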

The set of all elements in $B$ that are integral over $A$ is called the integral closure of $A$ in $B$. By the lemma, the integral closure is a subring of $B$ containing $A$. (Proof: if $x$ and $y$ are integral elements, then $A[xy]$ and $A[x-y]$ are contained in $A[x,y]$, finite over $A$.) It is also clear that integrality is transitive; that is, if $C$ is integral over $B$ and $B$ is integral over $A$, then $C$ is integral over $A$.

Proposition.

Let $f:A\to B$ be an integral extension where $B$ is a domain. Then

(i) $A$ is a field if and only if $B$ is a field.
(ii) Every nonzero ideal of $B$ has nonzero intersection with $A$.

Proof. (i) Suppose $B$ is a field, and let $0\neq x\in A$. Since $x^{-1}\in B$ and is integral over $A$, we can write:

x^{-n}=-(a_{n-1}x^{-(n-1)}+...+a_{1}x^{-1}+a_{0})

Multiplying both sides by $x^{n-1}$ we see $x^{-1}\in A$ . For the rest, let $0\neq b\in B$ . We have an integral equation:

-a_{0}=b^{n}+a_{n-1}b^{n-1}+...+a_{1}b=b(b^{n-1}+a_{n-1}b^{n-2}+...+a_{1})

Since $B$ is a domain, if $n$ is the minimal degree of a monic polynomial that annihilates $b$ , then it must be that $a_{0}\neq 0$ . This shows that $bB\cap A\neq 0$ , giving us (ii). Also, if $A$ is a field, then $a_{0}$ is invertible and so is $b$ . $\square$

Theorem (Noether normalization).

Let $A$ be a finitely generated $k$-algebra. Then we can find $z_{1},...,z_{d}$ such that

(1) $A$ is integral over $k[z_{1},...,z_{d}]$.
(2) $z_{1},...,z_{d}$ are algebraically independent over $k$.
(3) $z_{1},...,z_{d}$ are a separating transcendence basis of the field of fractions $K$ of $A$ if $K$ is separable over $k$.

Exercise A.10 (Artin-Tate).

Let $A\subset B\subset C$ be rings. Suppose $A$ is noetherian. If $C$ is finitely generated as an $A$-algebra and integral over $B$, then $B$ is finitely generated as an $A$-algebra.

Exercise.

A ring morphism $f:A\to \Omega$ (where $\Omega$ is an algebraically closed field) extends to $F:A[b]\to \Omega$. (Answer: http://www.math.uiuc.edu/~r-ash/ComAlg/)

Noetherian rings

Exercise.

A ring is noetherian if and only if every prime ideal is finitely generated. (See T. Y. Lam and Manuel L. Reyes, A Prime Ideal Principle in Commutative Algebra for a systematic study of results of this type.)

The next theorem furnishes many examples of a noetherian ring.

Theorem A.7 (Hilbert basis).

$A$ is a noetherian ring if and only if $A[T_{1},...,T_{n}]$ is noetherian.

Proof. By induction it suffices to prove $A[T]$ is noetherian. Let $I\triangleleft A[T]$. Let $L_{n}$ be the set of all leading coefficients of polynomials of degree $\leq n$ in $I$, together with $0$. Since $L_{n}\triangleleft A$ and $A$ is noetherian, there exists $d$ such that

L_{0}\subset L_{1}\subset L_{2}\subset \cdots \subset L_{d}=L_{d+1}=\cdots

For each $0\leq n\leq d$, choose finitely many elements $f_{1n},f_{2n},...,f_{m_{n}n}$ of $I$ of degree $n$ whose leading coefficients $b_{1n},...,b_{m_{n}n}$ generate $L_{n}$ (multiplying by a power of $T$ if necessary). Let $I'$ be the ideal generated by $f_{jn}$ for all $j,n$. We claim $I=I'$. It is clear that $I'\subset I$. We prove the opposite inclusion by induction on the degree of polynomials in $I$. Let $f\in I$, $a$ the leading coefficient of $f$ and $n$ the degree of $f$. Then $a\in L_{n}$. If $n\leq d$, then

a=a_{1}b_{1n}+a_{2}b_{2n}+...+a_{m_{n}}b_{{m_{n}}n}

with $a_{j}\in A$. In particular, if $g=a_{1}f_{1n}+a_{2}f_{2n}+...+a_{m_{n}}f_{{m_{n}}n}$, then $f-g$ has degree strictly less than that of $f$ and so by the inductive hypothesis $f-g\in I'$. Since $g\in I'$, $f\in I'$ then. If $n\geq d$, then $a\in L_{d}$ and the same argument, with $g$ multiplied by $T^{n-d}$, shows $f\in I'$. $\square$

Exercise.

Let $A$ be the ring of continuous functions $f:[0,1]\to \mathbf {R}$. $A$ is not noetherian.

Let $(A,{\mathfrak {m}})$ be a noetherian local ring with $k=A/{\mathfrak {m}}$ . Let ${\mathfrak {i}}\triangleleft A$ . Then ${\mathfrak {i}}$ is called an ideal of definition if $A/{\mathfrak {i}}$ is artinian.

Theorem.

\dim _{k}({\mathfrak {m}}/{\mathfrak {m}}^{2})\geq \dim A

The local ring $A$ is said to be regular if the equality holds in the above.

Theorem.

Let $A$ be a noetherian ring. Then $\dim A[T_{1},...,T_{n}]=n+\dim A$.

Proof. By induction, it suffices to prove the case $n=1$ . $\square$

Theorem.

Let $A$ be a finitely generated $k$-algebra. If $A$ is a domain with the field of fractions $K$, then $\dim A=\operatorname {trdeg} _{k}K$.

Proof. By the Noether normalization lemma, $A$ is integral over $k[x_{1},...,x_{n}]$ where $x_{1},...,x_{n}$ are algebraically independent over $k$. Thus, $\dim A=\dim k[x_{1},...,x_{n}]=n$. On the other hand, $\operatorname {trdeg} _{k}K=n$. $\square$

Theorem.

Let $A$ be a domain with (ACCP). Then $A$ is a UFD if and only if every prime ideal ${\mathfrak p}$ of height 1 is principal.

Proof. ( $\Rightarrow$ ) By Theorem A.10, ${\mathfrak {p}}$ contains a prime element $x$ . Then

0\subset (x)\subset {\mathfrak {p}}

where the second inclusion must be equality since ${\mathfrak {p}}$ has height 1. ( $\Leftarrow$ ) In light of Theorem A.10, it suffices to show that $A$ is a GCD domain. (TODO: complete the proof.) $\square$

Theorem.

A regular local ring is a UFD.

Theorem A.10 (Krull's intersection theorem).

Let ${\mathfrak i}\triangleleft A$ be a proper ideal. If $A$ is either a noetherian domain or a noetherian local ring, then $\bigcap _{n\geq 1}{\mathfrak i}^{n}=0$.

Theorem A.15.

Let ${\mathfrak i}\triangleleft A$. If $A$ is noetherian, ${\sqrt {\mathfrak i}}^{n}\subset {\mathfrak i}\subset {\sqrt {\mathfrak i}}$ for some $n$. In particular, the nilradical of $A$ is nilpotent.

Proof. It suffices to prove this when ${\mathfrak i}=0$. Thus, the proof reduces to proving that the nilradical of $A$ is nilpotent. Since $A$ is noetherian, we have finitely many nilpotent elements $x_{1},...,x_{n}$ that generate ${\sqrt {(0)}}$. A sufficiently high power of any linear combination of them is then a sum of terms, each containing a high power of some $x_{j}$. Thus, ${\sqrt {(0)}}$ is nilpotent. $\square$

Proposition A.8.

If $A$ is noetherian, then ${\hat {A}}$ is noetherian.

Corollary.

If $A$ is noetherian, then $A[[X]]$ is noetherian.

Zariski topology

Given ${\mathfrak a}\triangleleft A$, let $\operatorname {V} ({\mathfrak a})=\{{\mathfrak p}\in \operatorname {Spec} (A)|{\mathfrak p}\supset {\mathfrak a}\}$. (Note that $\operatorname {V} ({\mathfrak a})=\operatorname {V} ({\sqrt {\mathfrak a}})$.) It is easy to see $\operatorname {V} ({\mathfrak a})\cup \operatorname {V} ({\mathfrak b})=\operatorname {V} ({\mathfrak a}{\mathfrak b})=\operatorname {V} ({\mathfrak a}\cap {\mathfrak b})$, and $\bigcap _{\alpha }\operatorname {V} ({\mathfrak a}_{\alpha })=\operatorname {V} (({\mathfrak a}_{\alpha }|\alpha ))$.

It follows that the collection of the sets of the form $\operatorname {V} ({\mathfrak a})$ includes the empty set and $\operatorname {Spec} (A)$ and is closed under intersection and finite union. In other words, we can define a topology for $\operatorname {Spec} (A)$ by declaring the sets $\operatorname {V} ({\mathfrak a})$ to be the closed sets. The resulting topology is called the Zariski topology. Let $X=\operatorname {Spec} (A)$, and write $X_{f}=X\backslash \operatorname {V} ((f))=\{P\in X|P\not \ni f\}$.
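The identities for $\operatorname {V}$ can be checked mechanically in a small case. The Python sketch below works in $A=\mathbf{Z}/n\mathbf{Z}$, an assumption made for this example: every ideal is principal, and the prime ideals correspond to the primes $p$ dividing $n$. The names `spec`, `V` and `D` are hypothetical, and `D(f, n)` plays the role of $X_{f}$.

```python
from math import gcd


def spec(n):
    """Prime ideals of Z/nZ, each represented by the prime p generating it."""
    return {p for p in range(2, n + 1)
            if n % p == 0 and all(p % q for q in range(2, p))}


def V(d, n):
    """Closed set V((d)): the primes p | n with (p) containing (d), i.e. p | d."""
    return {p for p in spec(n) if d % p == 0}


def D(f, n):
    """Basic open set X_f: the primes of Z/nZ that do not contain f."""
    return spec(n) - V(f, n)


n = 60
a, b = 4, 15
lcm = a * b // gcd(a, b)                        # generates (a) ∩ (b)
print(V(a, n) | V(b, n), V(a * b, n), V(lcm, n))
print(V(a, n) & V(b, n), V(gcd(a, b), n))       # gcd generates (a) + (b)
print(D(a * b, n), D(a, n) & D(b, n))
```

Here the product of ideals is generated by `a * b`, the intersection by the least common multiple, and the sum by the greatest common divisor, so each printed line lists equal sets.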

Proposition A.16.

We have:

(i) $X_{f}$ is quasi-compact.
(ii) $X_{fg}$ is canonically isomorphic to $\operatorname {Spec} (A[f^{-1}])_{g}$ .

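Proof. We have: $X_{f}\subset \bigcup _{\alpha }X_{f_{\alpha }}=X\backslash \operatorname {V} ((f_{\alpha }|\alpha ))\Leftrightarrow (f)\subset {\sqrt {(f_{\alpha }|\alpha )}}\Leftrightarrow f^{n}\in (f_{\alpha _{1}},...,f_{\alpha _{k}})$ for some $n$ and finitely many indices $\alpha _{1},...,\alpha _{k}$. Hence, finitely many of the $X_{f_{\alpha }}$ already cover $X_{f}$. $\square$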

Exercise.

Let $A$ be a local ring. Then $\operatorname {Spec} (A)$ is connected.

Corollary.

$\operatorname {Spec} (B)\to \operatorname {Spec} (A)$ is a closed surjection.

Theorem A.12.

If $A_{\mathfrak m}$ is noetherian for every maximal ideal ${\mathfrak m}$ and if $\{{\mathfrak m}\in \operatorname {Max} (A)|x\in {\mathfrak m}\}$ is finite for each nonzero $x\in A$, then $A$ is noetherian.

Integrally closed domain

Lemma A.8.

In a GCD domain, if $(x,y)=1=(x,z)$, then $(x,yz)=1$.

Proposition A.9.

In a GCD domain, every irreducible element is prime.

Proof. Let $x$ be an irreducible, and suppose $x|yz$ . Then $x|(x,yz)$ . If $(x,yz)=1$ , $x$ is a unit, the case we tacitly ignore. Thus, by the lemma, $d=(x,y)$ , say, is a nonunit. Since $x$ is irreducible, $x|d$ and so $x|y$ . $\square$

In particular, in a polynomial ring that is a GCD domain, every irreducible polynomial is a prime element.

Theorem.

Let $A$ be a domain that satisfies the ascending chain condition on principal ideals (example: a noetherian domain). Then every nonzero nonunit $x$ in $A$ is a finite product of irreducibles.

Theorem A.10.

Let $A$ be a domain. The following are equivalent.

(1) Every nonzero nonunit element is a finite product of prime elements.
(2) (Kaplansky) Every nonzero prime ideal contains a prime element.
(3) $A$ is a GCD domain and has (ACC) on principal ideals.

Proof. (3) $\Rightarrow$ (2): Let ${\mathfrak p}\in \operatorname {Spec} (A)$. If ${\mathfrak p}$ is nonzero, it then contains a nonzero element $x$, which we factor into irreducibles: $x=p_{1}...p_{n}$. Then $p_{j}\in {\mathfrak p}$ for some $j$. Finally, irreducibles are prime since $A$ is a GCD domain. (2) $\Rightarrow$ (1): Let $S$ be the set of all products of prime elements. Clearly, $S$ satisfies the hypothesis of Theorem A.11 (i.e., closed under multiplication). Suppose, on the contrary, there is a nonzero nonunit $x$ that is not in $S$. It is easy to see that since $x\not \in S$, $(x)$ and $S$ are disjoint. Thus, by Theorem A.11, there is a prime ideal ${\mathfrak p}$ containing $x$ and disjoint from $S$. But, by (2), ${\mathfrak p}$ contains a prime element $y$; that is, ${\mathfrak p}$ intersects $S$, contradiction. (1) $\Rightarrow$ (3): By uniqueness of factorization, it is clear that $A$ is a GCD domain. $\square$

A domain satisfying the equivalent conditions in the theorem is called a unique factorization domain or a UFD for short.

Corollary.

If $A$ is a UFD, then $A[X]$ is a UFD. If $A$ is a principal ideal domain, then $A[[X]]$ is a UFD.

Theorem A.13 (Nagata criterion).

Let $A$ be a domain, and $S\subset A$ a multiplicatively closed subset generated by prime elements. Then $A$ is a UFD if and only if $S^{-1}A$ is a UFD.

Field theory

Basic definitions

Let $L/k$ be a field extension; i.e., $k$ is a subfield of a field $L$. Then $L$ has a $k$-algebra structure; in particular, a vector space structure. A transcendental element is an element that is not integral; in other words, $x$ is transcendental over $k$ if and only if $k[x]$ is (isomorphic to) the polynomial ring in one variable. The situation can be phrased more abstractly as follows. Given an element $x$ in an extension $L/k$ and an indeterminate $t$, we have the exact sequence:

0\to {\mathfrak {p}}\to k[t]\to k[x]\to 0

by letting $t\mapsto x$ and ${\mathfrak p}$ the kernel of that map. Thus, $x$ is transcendental over $k$ if and only if ${\mathfrak p}=0$. Since $k[t]$ is a PID, when nonzero, ${\mathfrak p}$ is generated by a nonzero polynomial called the minimal polynomial of $x$, which must be irreducible since $k[x]$ is a domain and so ${\mathfrak p}$ is prime. (Note that if we replace $k[t]$ by $k[t,s]$, say, then it is no longer a PID; therefore the kernel is no longer principal. So, in general, if a subset $S\subset L$ is such that $k[S]$ is a polynomial ring where members of $S$ are variables, then $S$ is said to be algebraically independent. By convention, the empty set is algebraically independent, just as it is linearly independent.) Finally, as a custom, we call an integral field extension an algebraic extension.

When $L$ has finite dimension over $k$, the extension is called a finite extension. Every finite extension is algebraic. Indeed, if $x\in L$ is transcendental over $k$, then $k[x]$ is a "polynomial ring" and therefore is an infinite-dimensional subspace of $L$, and $L$ must be infinite-dimensional as well.

Exercise.

A complex number is called an algebraic number if it is integral over $\mathbf {Q}$. The set of all algebraic numbers is countable.

A field is called algebraically closed if it admits no nontrivial algebraic field extension. (A field is always an algebraic extension of itself, a trivial extension.) More concretely, a field is algebraically closed if every root of a polynomial over that field is already in that field. It follows from the Axiom of Choice that every field is a subfield of some algebraically closed field.

Separable extensions

A field extension $L/k$ is said to be separable if it is separable as a $k$-algebra; i.e., $L\otimes _{k}F$ is reduced for every field extension $F/k$. The next theorem assures that this is equivalent to the classical definition.

Theorem.

An algebraic extension $L$ of $k$ is separable over $k$ if and only if the minimal polynomial $f$ of every element of $L$ has distinct roots (i.e., $f$ and its derivative $f'$ have no common root).

For the remainder of the section, $p$ denotes the characteristic exponent of a field $k$ (i.e., $p=1$ if $\operatorname {char} (k)=0$ and $p=\operatorname {char} (k)$ otherwise). If the injection $x\mapsto x^{p}:k\to k$ is actually surjective (therefore, an automorphism), then $k$ is called perfect. Examples: fields of characteristic zero and finite fields are perfect. Imperfect fields are therefore rather rare; they appear in algebraic geometry, a topic in later chapters. We let $k_{p}$ be the union of $k$ adjoined with $p^{e}$-th roots of elements in $k$ over all positive integers $e$. $k_{p}$ is then called the perfect closure of $k$ since there is no strictly smaller subfield of $k_{p}$ that contains $k$ and is perfect.

Proposition.

A $k$-algebra $A$ is separable if and only if $A\otimes _{k}k_{p}$ is reduced.

Proposition.

The following are equivalent.

(i) A field is perfect.
(ii) Every finite extension is separable.
(iii) Every extension is separable.

Proof. Suppose (ii) is false; it is then necessary that $p>1$ and . Finally, if (iii) is false, then there is an extension $L/k$ such that $L\otimes _{k}k_{p}$ is not reduced. Since $k_{p}$ is algebraic over $k$ by construction, it has a finite extension $F$ such that $L\otimes _{k}F$ is not reduced. This $F$ falsifies (ii). $\square$

In particular, any algebraic extension of a perfect field is perfect.

Separable extensions

Let $L/K$ be a field extension, and $p$ be the characteristic exponent of $K$ (i.e., $p=1$ if $K$ has characteristic zero; otherwise, $p=\operatorname {char} K$). $L$ is said to be separable over $K$ if $L\otimes _{K}K^{p^{-1}}$ is a domain. A maximal separable algebraic extension of $K$ is called the separable closure of $K$ and denoted by $K_{s}$.

A field is said to be perfect if its separable closure is algebraically closed. A field is said to be purely inseparable if it equals its separable closure. (As the reader would notice, the terminology so far is quite confusing; but it is historical.)

Lemma.

An algebraic extension is separable if and only if the minimal polynomial of any element has no multiple root.

Proof. We may assume that the extension is finite. $\square$

Proposition.

A field is perfect if and only if either (i) its characteristic is zero or (ii)

x\mapsto x^{p}

is an automorphism of

K

Proof. First suppose $K$ has characteristic zero. Let $f$ be an irreducible polynomial. If $f$ and $f'$ have a common root, then, since $f$ is irreducible, $f$ must divide $f'$ and so $f'=0$ since $\deg f'<\deg f$ . On the other hand, if $f(t)=a_{0}+a_{1}t+...+a_{n}t^{n}$ with $n\geq 1$ and $a_{n}\neq 0$ , then

f'(t)=a_{1}+2a_{2}t+...+na_{n}t^{n-1}\neq 0

.

Thus, a field of characteristic zero is perfect. $\square$
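The role of the derivative in this argument can be checked on a small example. The following Python sketch computes the formal derivative of $t^{3}-2$ over the integers and modulo 3; only in the second case does it vanish. The helper name `derivative` and the list-of-coefficients representation are illustrative assumptions, not notation from the text.

```python
def derivative(coeffs, p=0):
    """Formal derivative of a_0 + a_1 t + ... + a_n t^n.

    coeffs lists a_0, ..., a_n; when p > 0 the result is reduced modulo p.
    Trailing zero coefficients are dropped, so [] means the zero polynomial.
    """
    d = [k * a for k, a in enumerate(coeffs)][1:]
    if p:
        d = [c % p for c in d]
    while d and d[-1] == 0:
        d.pop()
    return d


f = [-2, 0, 0, 1]             # t^3 - 2
print(derivative(f))          # derivative over the integers: 3 t^2
print(derivative(f, p=3))     # derivative modulo 3: the zero polynomial
```

Over $\mathbf {F} _{3}$ itself, $t^{3}-2=(t-2)^{3}$ is reducible, which is consistent with finite fields being perfect; a vanishing derivative only obstructs separability for irreducible polynomials such as $t^{p}-x$ over $\mathbf {F} _{p}(x)$ .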

Corollary.

A finite field is perfect.
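As a sanity check of the corollary, one can verify by brute force that $x\mapsto x^{p}$ is a bijection on a small finite field. The sketch below models the field with $p^{2}$ elements as $\mathbf {F} _{p}[i]/(i^{2}+1)$ , which is a field when $p\equiv 3{\pmod {4}}$ ; the pair representation and the function names are assumptions made for illustration.

```python
from itertools import product


def mul(u, v, p):
    """Multiply a + b*i and c + d*i in F_p[i]/(i^2 + 1)."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % p, (a * d + b * c) % p)


def power(u, n, p):
    """Compute u**n by repeated multiplication."""
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u, p)
    return result


def frobenius_is_bijective(p):
    """Check that x -> x^p hits every element of F_p[i]/(i^2 + 1)."""
    elements = list(product(range(p), repeat=2))
    images = {power(u, p, p) for u in elements}
    return len(images) == len(elements)


print(frobenius_is_bijective(3), frobenius_is_bijective(7))
```

For $p=3$ the map sends $a+bi$ to $a-bi$ , so it is visibly its own inverse.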

Proposition.

Let

L/K

be a finite extension and $F$ an intermediate field. Then

L

is separable over

K

if and only if

L

is separable over

F

and

F

is separable over

K

.

Proposition.

Every finite field extension factors as a separable extension followed by a purely inseparable extension. More precisely,

Exercise.

(Clark p. 33) Let

k

be a field of characteristic 2,

F=k(x,y)

,

u\in F

a root of

t^{2}+t+x

,

S=F(u)

and

K=S({\sqrt {uy}})

. Then (i)

K/S

is purely inseparable and

S/F

is separable. (ii) There is no nontrivial purely inseparable subextension of K/F.

Theorem (Primitive element).

Let

L=K[x_{1},...,x_{n}]

be a finite extension, where

x_{2},...,x_{n}

(but not necessarily

x_{1}

) are separable over

K

. Then

L=K[z]

for some

z\in L

.

Proof. It suffices to prove the case $n=2$ (TODO: why?) Let $\mu _{i}$ be the minimal polynomials of $x_{i}$ . $\square$

Theorem.

Let

L/K

be a finitely generated field extension. Then the following are equivalent.

$L$ is separable over $K$ .
$L$ has a separating transcendence basis over $K$ .
$L\otimes _{K}K^{p^{-r}}$ is a domain.

Transcendental extensions

Theorem (Lüroth).

Any subfield

E

of

F(X)

containing

F

but not equal to

F

is a purely transcendental extension of

F

.

Let $L\supset K$ be a field extension of degree $n<\infty$ . An element $x\in L$ defines a $K$ -linear map:

x_{L}:L\to L,y\mapsto xy

.

We define

$\operatorname {Tr} _{L/K}(x)=\operatorname {Tr} (x_{L}).$
$\operatorname {Nm} _{L/K}(x)=\operatorname {det} (x_{L}).$
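To make the definition concrete, the following Python sketch computes the trace and norm of $a+b{\sqrt {d}}$ in $\mathbf {Q} ({\sqrt {d}})/\mathbf {Q}$ from the matrix of multiplication on the basis $(1,{\sqrt {d}})$ . The choice of basis and the helper names are assumptions for the purpose of illustration.

```python
from fractions import Fraction


def mult_matrix(a, b, d):
    """Matrix of y -> (a + b*sqrt(d)) * y on the basis (1, sqrt(d)).

    Columns are the images of the basis vectors:
    (a + b*sqrt(d)) * 1       = a   + b*sqrt(d)
    (a + b*sqrt(d)) * sqrt(d) = b*d + a*sqrt(d)
    """
    a, b = Fraction(a), Fraction(b)
    return [[a, b * d], [b, a]]


def trace(m):
    return m[0][0] + m[1][1]


def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


m = mult_matrix(3, 2, 2)      # x = 3 + 2*sqrt(2)
print(trace(m), det(m))       # Tr(x) = 2a, Nm(x) = a^2 - d*b^2
```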

Proposition.

Let

L\supset K\supset F

be finite field extensions. Then

(i) $\operatorname {Tr} _{L/F}=\operatorname {Tr} _{K/F}\circ \operatorname {Tr} _{L/K}.$
(ii) $\operatorname {Nm} _{L/F}=\operatorname {Nm} _{K/F}\circ \operatorname {Nm} _{L/K}.$

Theorem A.8 (Hilbert 90).

If

L/K

is a finite Galois extension, then

\operatorname {H} ^{1}(\operatorname {Gal} (L/K),L^{\times })=0

.

Corollary.

Let

L/K

be a cyclic extension, and let

\sigma

generate

\operatorname {Gal} (L/K)

. If

a\in L

is such that

\operatorname {Nm} _{L/K}(a)=1

, then

a={\sigma (b) \over b}

for some

b

.
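For a quadratic extension the element $b$ can be written down explicitly: if $\operatorname {Nm} (a)=1$ and $a\neq -1$ , then $b=1+\sigma (a)$ works, because $a(1+\sigma (a))=a+1=\sigma (b)$ . The Python sketch below checks this in $\mathbf {Q} (i)/\mathbf {Q}$ with $\sigma$ the complex conjugation; representing elements as pairs of rationals is an assumption made here for exact arithmetic.

```python
from fractions import Fraction


def conj(z):
    """The generator sigma of Gal(Q(i)/Q): a + b*i -> a - b*i."""
    return (z[0], -z[1])


def mul(z, w):
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def norm(z):
    """Nm(z) = z * sigma(z) = a^2 + b^2."""
    return z[0] ** 2 + z[1] ** 2


def inv(z):
    n = norm(z)
    return (z[0] / n, -z[1] / n)


def hilbert90_witness(a):
    """Return b = 1 + sigma(a), which satisfies a = sigma(b) / b."""
    if norm(a) != 1 or a == (-1, 0):
        raise ValueError("need Nm(a) = 1 and a != -1")
    return (1 + a[0], -a[1])


a = (Fraction(3, 5), Fraction(4, 5))     # a = (3 + 4i)/5 has norm 1
b = hilbert90_witness(a)
print(b, mul(conj(b), inv(b)) == a)
```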

A. Theorem A field extension $L/K$ is algebraic if and only if it is the direct limit of its finite subextensions.

A field extension $K/F$ is said to be Galois if

K^{\operatorname {Aut} (K/F)}=F.

Here, we used the notation of invariance:

K^{G}=\{x\in K|\sigma (x)=x,\forall \sigma \in G\}

(In particular, when $K/F$ is a finite extension, $K/F$ is a Galois extension if and only if $|\operatorname {Aut} (K/F)|=[K:F]$ .) When $K/F$ is Galois, we set $\operatorname {Gal} (K/F)=\operatorname {Aut} (K/F)$ , and call $\operatorname {Gal} (K/F)$ the Galois group of $K/F$ .

A. Theorem A field extension $K/F$ is Galois if and only if it is normal and separable.

Integrally closed domain

A domain $A$ is said to be integrally closed if $A$ equals the integral closure of $A$ in its field of fractions.

Proposition.

GCD domains and valuation domains are integrally closed.

Proof. Suppose $A$ is a GCD domain and that $r/s$ , with $r,s\in A$ , is integral over $A$ ; i.e.,

(r/s)^{n}+a_{n-1}(r/s)^{n-1}+...+a_{1}(r/s)+a_{0}=0

.

We may assume $(r,s)=1$ . Multiplying by $s^{n}$ , it follows:

r^{n}=-\left(a_{n-1}r^{n-1}s+...+a_{1}rs^{n-1}+a_{0}s^{n}\right)

.

and so $s|r^{n}$ . Since $(r^{n},s)=1$ by Lemma A.8, we have that $s$ is a unit in $A$ , and thus $r/s\in A$ . The case of valuation domains is very similar. $\square$
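For $A=\mathbf {Z}$ the proposition is the familiar statement that a rational root of a monic integer polynomial is an integer. The brute-force Python sketch below illustrates this; the search bound and the function names are assumptions chosen for the example, and the search only sees roots whose numerator and denominator lie within the bound.

```python
from fractions import Fraction


def evaluate(coeffs, x):
    """Horner evaluation; coeffs are listed from the leading coefficient down."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value


def rational_roots(coeffs, bound):
    """All roots r/s with |r| <= bound and 1 <= s <= bound."""
    roots = set()
    for s in range(1, bound + 1):
        for r in range(-bound, bound + 1):
            x = Fraction(r, s)
            if evaluate(coeffs, x) == 0:
                roots.add(x)
    return roots


monic = [1, -6, 11, -6]       # t^3 - 6t^2 + 11t - 6, monic
non_monic = [2, -3, 1]        # 2t^2 - 3t + 1, not monic
print(rational_roots(monic, 12))
print(rational_roots(non_monic, 12))
```

The non-monic polynomial has the root $1/2$ , which is not integral over $\mathbf {Z}$ ; the monic one has only integer roots.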

Proposition.

"integrally closed" is a local property.

Proposition.

Let

A

be a domain. The following are equivalent.

Every finitely generated submodule of a projective $A$ -module is projective.
Every finitely generated nonzero ideal of $A$ is invertible.
$A_{\mathfrak {p}}$ is a valuation domain for every prime ideal ${\mathfrak {p}}\triangleleft A$ .
Every overring of $A$ is the intersection of localizations of $A$ .
Every overring of $A$ is integrally closed.

A domain satisfying any/all of the equivalent conditions in the proposition is called a Prüfer domain. A noetherian Prüfer domain is called a Dedekind domain.

Proposition A.10.

Let

A

be an integrally closed domain, and

L

a finite extension of the field of fractions

K=A_{(0)}

. Then

x\in L

is integral over

A

if and only if its minimal polynomial in

K[X]

is in

A[X]

.

A Dedekind domain is a domain whose proper ideals are products of prime ideals.

A. Theorem Every UFD that is a Dedekind domain is a principal ideal domain.
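Proof: Let ${\mathfrak {p}}$ be a prime ideal. We may assume ${\mathfrak {p}}$ is nonzero; thus, it contains a nonzero element $x$ . Replacing $x$ by one of its irreducible factors lying in ${\mathfrak {p}}$ (one exists since ${\mathfrak {p}}$ is prime), we may assume that $x$ is irreducible; thus, prime by unique factorization. Since $(x)$ is then a nonzero prime ideal, and nonzero prime ideals of a Dedekind domain are maximal, $(x)\subset {\mathfrak {p}}$ forces $(x)={\mathfrak {p}}$ . Thus, every prime ideal is principal. Since every proper ideal is a product of prime ideals, every ideal is principal. $\square$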

Theorem Let A be an integral domain. Then A is a Dedekind domain if and only if:

(i) A is integrally closed.
(ii) A is noetherian, and
(iii) Every prime ideal is maximal.

A. Theorem Let A be a Dedekind domain with fraction field K. Let L be a finite degree field extension of K and denote by S the integral closure of A in L. Then S is itself a Dedekind domain.

A Lemma Let $A$ be a noetherian integral domain that is not a field. Then $A$ is a Dedekind domain if and only if the localization of $A$ at every nonzero prime ideal is a discrete valuation ring.

Lemma Let $A$ be a noetherian ring. Then every ideal contains a product of nonzero prime ideals.
Proof: Let $S$ be the set of all ideals that do not contain a product of nonzero prime ideals. If the lemma is false, $S$ is nonempty. Since $A$ is noetherian, $S$ has a maximal element ${\mathfrak {i}}$ . Note that ${\mathfrak {i}}$ is not prime; thus, there are $a,b$ such that $ab\in {\mathfrak {i}}$ but $a\not \in {\mathfrak {i}}$ and $b\not \in {\mathfrak {i}}$ . Now, $({\mathfrak {i}}+(a))({\mathfrak {i}}+(b))\subset {\mathfrak {i}}$ . Since both ${\mathfrak {i}}+(a)$ and ${\mathfrak {i}}+(b)$ are strictly larger than ${\mathfrak {i}}$ , which is maximal in $S$ , ${\mathfrak {i}}+(a)$ and ${\mathfrak {i}}+(b)$ are both not in $S$ and both contain products of prime ideals. Hence, ${\mathfrak {i}}$ contains a product of prime ideals. $\square$

A local principal ideal domain that is not a field is called a discrete valuation ring. A typical example is the localization of a Dedekind domain at a nonzero prime ideal.

Henselian rings

References

Pete L. Clark. Commutative Algebra
Pete L. Clark. Field Theory
J.S. Milne. A Primer of Commutative Algebra
J.S. Milne. Fields and Galois Theory
Matsumura, Commutative ring theory

Linear algebra

The Moore-Penrose inverse

Inverse matrices play a key role in linear algebra, and particularly in computations. However, only square matrices can possibly be invertible. This leads us to introduce the Moore-Penrose inverse of a potentially non-square real- or complex-valued matrix, which satisfies some but not necessarily all of the properties of an inverse matrix.

Definition.

Let

A

be an m-by-n matrix over a field

\mathbb {K}

and

A^{+}

be an n-by-m matrix over

\mathbb {K}

, where

\mathbb {K}

is either

\mathbb {R}

, the real numbers, or

\mathbb {C}

, the complex numbers. Recall that

A^{*}

refers to the conjugate transpose of

A

. Then the following four criteria are called the Moore–Penrose conditions for $A$ :

$AA^{+}A=A$ ,
$A^{+}AA^{+}=A^{+}$ ,
$\left(AA^{+}\right)^{*}=AA^{+}$ ,
$\left(A^{+}A\right)^{*}=A^{+}A$ .

We will see below that given a matrix $A$ , there exists a unique matrix $A^{+}$ that satisfies all four of the Moore–Penrose conditions. They generalise the properties of the usual inverse.
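The four conditions are easy to test numerically. The following sketch, assuming NumPy is available, checks them for a non-square real matrix and the candidate returned by `numpy.linalg.pinv`; the helper `satisfies_moore_penrose` and the tolerance are illustrative choices, not part of the definition.

```python
import numpy as np


def satisfies_moore_penrose(A, X, tol=1e-10):
    """Return True when X meets all four Moore-Penrose conditions for A (up to tol)."""
    AX, XA = A @ X, X @ A
    return bool(
        np.allclose(AX @ A, A, atol=tol)
        and np.allclose(XA @ X, X, atol=tol)
        and np.allclose(AX.conj().T, AX, atol=tol)
        and np.allclose(XA.conj().T, XA, atol=tol)
    )


A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # 3-by-2, not square
X = np.linalg.pinv(A)                                # candidate for A^+
print(X.shape, satisfies_moore_penrose(A, X))
print(satisfies_moore_penrose(A, A.T))               # the transpose is not a pseudoinverse here
```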

Remark.

If

A

is an invertible square matrix, then the ordinary inverse

A^{-1}

satisfies the Moore-Penrose conditions for

A

. Observe also that if

A^{+}

satisfies the Moore-Penrose conditions for

A

, then

A

satisfies the Moore-Penrose conditions for

A^{+}

.

Basic properties of the Hermitian conjugate

We assemble some basic properties of the conjugate transpose for later use. In the following lemmas, $A$ is a matrix with complex elements and n columns, $B$ is a matrix with complex elements and n rows.

Lemma (1).

For any

\mathbb {K}

-matrix

A

,

A^{*}A=0\Rightarrow A=0

Proof. The assumption says that all elements of A*A are zero. Therefore,

0=\operatorname {Tr} \left(A^{*}A\right)=\sum _{j=1}^{n}\left(A^{*}A\right)_{jj}=\sum _{j=1}^{n}\sum _{i=1}^{m}\left(A^{*}\right)_{ji}A_{ij}=\sum _{i=1}^{m}\sum _{j=1}^{n}\left|A_{ij}\right|^{2}.

Therefore, all $A_{ij}$ equal 0 i.e. $A=0$ . $\square$
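The identity $\operatorname {Tr} (A^{*}A)=\sum _{i,j}|A_{ij}|^{2}$ used in the proof can be checked on a small complex matrix. The sketch below uses plain Python lists; the helper names are assumptions for the example.

```python
def conj_transpose(A):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]


def matmul(A, B):
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


A = [[1 + 2j, 3], [0, -1j], [2, 1 - 1j]]          # 3-by-2 complex matrix
lhs = trace(matmul(conj_transpose(A), A))         # Tr(A* A)
rhs = sum(a.real ** 2 + a.imag ** 2 for row in A for a in row)   # sum of |A_ij|^2
print(lhs, rhs)
```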

Lemma (2).

For any

\mathbb {K}

-matrix

A

,

A^{*}AB=0\Rightarrow AB=0

Proof. : ${\begin{aligned}0&=A^{*}AB&\\\Rightarrow 0&=B^{*}A^{*}AB&\\\Rightarrow 0&=(AB)^{*}(AB)&\\\Rightarrow 0&=AB&({\text{by Lemma 1}})\end{aligned}}$ $\square$

Lemma (3).

For any

\mathbb {K}

-matrix

A

,

ABB^{*}=0\Rightarrow AB=0

Proof. This is proved in a manner similar to the argument of Lemma 2 (or by simply taking the Hermitian conjugate). $\square$

Existence and uniqueness

We establish existence and uniqueness of the Moore-Penrose inverse for every matrix.

Theorem.

If

A

is a

\mathbb {K}

-matrix and

A_{1}^{+}

and

A_{2}^{+}

satisfy the Moore-Penrose conditions for

A

, then

A_{1}^{+}=A_{2}^{+}

.

Proof. Let $A$ be a matrix over $\mathbb {R}$ or $\mathbb {C}$ . Suppose that ${A_{1}^{+}}$ and ${A_{2}^{+}}$ are Moore–Penrose inverses of $A$ . Observe then that

A{A_{1}^{+}}{\overset {(1)}{{}={}}}(A{A_{2}^{+}}A){A_{1}^{+}}=(A{A_{2}^{+}})(A{A_{1}^{+}}){\overset {(3)}{{}={}}}(A{A_{2}^{+}})^{*}(A{A_{1}^{+}})^{*}={A_{2}^{+}}^{*}(A{A_{1}^{+}}A)^{*}{\overset {(1)}{{}={}}}{A_{2}^{+}}^{*}A^{*}=(A{A_{2}^{+}})^{*}{\overset {(3)}{{}={}}}A{A_{2}^{+}}.

Analogously we conclude that ${A_{1}^{+}}A={A_{2}^{+}}A$ . The proof is completed by observing that then

{A_{1}^{+}}{\overset {(2)}{{}={}}}{A_{1}^{+}}A{A_{1}^{+}}={A_{1}^{+}}A{A_{2}^{+}}=A_{2}^{+}A{A_{2}^{+}}{\overset {(2)}{{}={}}}{A_{2}^{+}}.

\square

Theorem.

For every

\mathbb {K}

-matrix

A

there is a matrix

A^{+}

satisfying the Moore-Penrose conditions for

A

Proof. The proof proceeds in stages.

$A$ is a 1-by-1 matrix

For any $x\in \mathbb {K}$ , we define:

x^{+}:={\begin{cases}x^{-1},&{\mbox{if }}x\neq 0\\0,&{\mbox{if }}x=0\end{cases}}

It is easy to see that $x^{+}$ is a pseudoinverse of $x$ (interpreted as a 1-by-1 matrix).

$A$ is a square diagonal matrix

Let $D$ be an n-by-n matrix over $\mathbb {K}$ with zeros off the diagonal. We define $D^{+}$ as an n-by-n matrix over $\mathbb {K}$ with $\left(D^{+}\right)_{ij}:=\left(D_{ij}\right)^{+}$ as defined above. We write simply $D_{ij}^{+}$ for $\left(D^{+}\right)_{ij}=\left(D_{ij}\right)^{+}$ .

Notice that $D^{+}$ is also a matrix with zeros off the diagonal.

We now show that $D^{+}$ is a pseudoinverse of $D$ :

$\left(DD^{+}D\right)_{ij}=D_{ij}D_{ij}^{+}D_{ij}=D_{ij}\Rightarrow DD^{+}D=D$
$\left(D^{+}DD^{+}\right)_{ij}=D_{ij}^{+}D_{ij}D_{ij}^{+}=D_{ij}^{+}\Rightarrow D^{+}DD^{+}=D^{+}$
$\left(DD^{+}\right)_{ij}^{*}={\overline {\left(DD^{+}\right)_{ji}}}={\overline {D_{ji}D_{ji}^{+}}}=\left(D_{ji}D_{ji}^{+}\right)^{*}=D_{ji}D_{ji}^{+}=D_{ij}D_{ij}^{+}\Rightarrow \left(DD^{+}\right)^{*}=DD^{+}$
$\left(D^{+}D\right)_{ij}^{*}={\overline {\left(D^{+}D\right)_{ji}}}={\overline {D_{ji}^{+}D_{ji}}}=\left(D_{ji}^{+}D_{ji}\right)^{*}=D_{ji}^{+}D_{ji}=D_{ij}^{+}D_{ij}\Rightarrow \left(D^{+}D\right)^{*}=D^{+}D$

$A$ is a general diagonal matrix

Let $D$ be an m-by-n matrix over $\mathbb {K}$ with zeros off the main diagonal, where m and n are unequal. That is, $D_{ij}=d_{i}$ for some $d_{i}\in \mathbb {K}$ when $i=j$ and $D_{ij}=0$ otherwise.

Consider the case where $n>m$ . Then we can rewrite $D=\left[D_{0}\,\,\mathbf {0} _{m\times (n-m)}\right]$ by stacking where $D_{0}$ is a square diagonal m-by-m matrix, and $\mathbf {0} _{m\times (n-m)}$ is the m-by-(n−m) zero matrix. We define $D^{+}\equiv {\begin{bmatrix}D_{0}^{+}\\\mathbf {0} _{(n-m)\times m}\end{bmatrix}}$ as an n-by-m matrix over $\mathbb {K}$ , with $D_{0}^{+}$ the pseudoinverse of $D_{0}$ defined above, and $\mathbf {0} _{(n-m)\times m}$ the (n−m)-by-m zero matrix. We now show that $D^{+}$ is a pseudoinverse of $D$ :

By multiplication of block matrices, $DD^{+}=D_{0}D_{0}^{+}+\mathbf {0} _{m\times (n-m)}\mathbf {0} _{(n-m)\times m}=D_{0}D_{0}^{+},$ so by property 1 for square diagonal matrices $D_{0}$ proven in the previous section, $DD^{+}D=D_{0}D_{0}^{+}\left[D_{0}\,\,\mathbf {0} _{m\times (n-m)}\right]=\left[D_{0}D_{0}^{+}D_{0}\,\,\mathbf {0} _{m\times (n-m)}\right]=\left[D_{0}\,\,\mathbf {0} _{m\times (n-m)}\right]=D$ .
Similarly, $D^{+}D={\begin{bmatrix}D_{0}^{+}D_{0}&\mathbf {0} _{m\times (n-m)}\\\mathbf {0} _{(n-m)\times m}&\mathbf {0} _{(n-m)\times (n-m)}\end{bmatrix}}$ , so $D^{+}DD^{+}={\begin{bmatrix}D_{0}^{+}D_{0}&\mathbf {0} _{m\times (n-m)}\\\mathbf {0} _{(n-m)\times m}&\mathbf {0} _{(n-m)\times (n-m)}\end{bmatrix}}{\begin{bmatrix}D_{0}^{+}\\\mathbf {0} _{(n-m)\times m}\end{bmatrix}}={\begin{bmatrix}D_{0}^{+}D_{0}D_{0}^{+}\\\mathbf {0} _{(n-m)\times m}\end{bmatrix}}=D^{+}.$
By 1 and property 3 for square diagonal matrices, $\left(DD^{+}\right)^{*}=\left(D_{0}D_{0}^{+}\right)^{*}=D_{0}D_{0}^{+}=DD^{+}$ .
By 2 and property 4 for square diagonal matrices, $\left(D^{+}D\right)^{*}={\begin{bmatrix}\left(D_{0}^{+}D_{0}\right)^{*}&\mathbf {0} _{m\times (n-m)}\\\mathbf {0} _{(n-m)\times m}&\mathbf {0} _{(n-m)\times (n-m)}\end{bmatrix}}={\begin{bmatrix}D_{0}^{+}D_{0}&\mathbf {0} _{m\times (n-m)}\\\mathbf {0} _{(n-m)\times m}&\mathbf {0} _{(n-m)\times (n-m)}\end{bmatrix}}=D^{+}D.$

Existence for $D$ such that $m>n$ follows by swapping the roles of $D$ and $D^{+}$ in the $n>m$ case and using the fact that $\left(D^{+}\right)^{+}=D$ .

$A$ is an arbitrary matrix

The singular value decomposition theorem states that there exists a factorization of the form

A=U\Sigma V^{*}

where:

U

is an m-by-m unitary matrix over

\mathbb {K}

.

\Sigma

is an m-by-n matrix over

\mathbb {K}

with nonnegative real numbers on the diagonal and zeros off the diagonal.

V

is an n-by-n unitary matrix over

\mathbb {K}

.^[1]

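Define $A^{+}$ as $V\Sigma ^{+}U^{*}$ , where $\Sigma ^{+}$ is the pseudoinverse of the diagonal matrix $\Sigma$ constructed in the previous steps.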

We now show that $A^{+}$ is a pseudoinverse of $A$ :

$AA^{+}A=U\Sigma V^{*}V\Sigma ^{+}U^{*}U\Sigma V^{*}=U\Sigma \Sigma ^{+}\Sigma V^{*}=U\Sigma V^{*}=A$
$A^{+}AA^{+}=V\Sigma ^{+}U^{*}U\Sigma V^{*}V\Sigma ^{+}U^{*}=V\Sigma ^{+}\Sigma \Sigma ^{+}U^{*}=V\Sigma ^{+}U^{*}=A^{+}$
$\left(AA^{+}\right)^{*}=\left(U\Sigma V^{*}V\Sigma ^{+}U^{*}\right)^{*}=\left(U\Sigma \Sigma ^{+}U^{*}\right)^{*}=U\left(\Sigma \Sigma ^{+}\right)^{*}U^{*}=U\left(\Sigma \Sigma ^{+}\right)U^{*}=U\Sigma V^{*}V\Sigma ^{+}U^{*}=AA^{+}$
$\left(A^{+}A\right)^{*}=\left(V\Sigma ^{+}U^{*}U\Sigma V^{*}\right)^{*}=\left(V\Sigma ^{+}\Sigma V^{*}\right)^{*}=V\left(\Sigma ^{+}\Sigma \right)^{*}V^{*}=V\left(\Sigma ^{+}\Sigma \right)V^{*}=V\Sigma ^{+}U^{*}U\Sigma V^{*}=A^{+}A$ $\square$
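The construction in the proof translates directly into a few lines of NumPy. The sketch below builds $V\Sigma ^{+}U^{*}$ from `numpy.linalg.svd` and compares it with `numpy.linalg.pinv`. The cutoff `tol` below which a singular value is treated as zero is an assumption forced by floating-point arithmetic; it has no counterpart in the exact proof.

```python
import numpy as np


def pinv_via_svd(A, tol=1e-12):
    """Pseudoinverse V Sigma^+ U^* built from a full singular value decomposition."""
    m, n = A.shape
    U, s, Vh = np.linalg.svd(A, full_matrices=True)   # A = U @ Sigma @ Vh
    Sigma_plus = np.zeros((n, m))                     # n-by-m, zeros off the diagonal
    for i, sigma in enumerate(s):
        if sigma > tol:                               # x^+ = 1/x for x != 0, else 0
            Sigma_plus[i, i] = 1.0 / sigma
    return Vh.conj().T @ Sigma_plus @ U.conj().T


A = np.array([[1.0, 2.0], [2.0, 4.0], [0.0, 0.0]])    # rank one, 3-by-2
A_plus = pinv_via_svd(A)
print(np.allclose(A_plus, np.linalg.pinv(A, rcond=1e-10)))
```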

This leads us to the natural definition:

Definition (Moore-Penrose inverse).

Let

A

be a

\mathbb {K}

-matrix. Then the unique

\mathbb {K}

-matrix satisfying the Moore-Penrose conditions for

A

is called the Moore-Penrose inverse

A^{+}

of

A

.

Basic properties

We have already seen above that the Moore-Penrose inverse generalises the classical inverse to potentially non-square matrices. We will now list some basic properties of its interaction with the Hermitian conjugate, leaving most of the proofs as exercises to the reader.

Exercise.

For any

\mathbb {K}

-matrix

A

,

{A^{*}}^{+}={A^{+}}^{*}

The following identities hold:

$A^{+}=A^{+}A^{+*}A^{*}$
$A^{+}=A^{*}A^{+*}A^{+}$
$A=A^{+*}A^{*}A$
$A=AA^{*}A^{+*}$
$A^{*}=A^{*}AA^{+}$
$A^{*}=A^{+}AA^{*}$

Proof of the first one: $A^{+}=A^{+}AA^{+}$ and $AA^{+}=\left(AA^{+}\right)^{*}$ imply that $A^{+}=A^{+}\left(AA^{+}\right)^{*}=A^{+}A^{+^{*}}A^{*}$ . □

The remaining identities are left as exercises.

Reduction to the Hermitian case

The results of this section show that the computation of the pseudoinverse is reducible to its construction in the Hermitian case. It suffices to show that the putative constructions satisfy the defining criteria.

Proposition.

For every

\mathbb {K}

-matrix

A

,

A^{+}=A^{*}(AA^{*})^{+}

Proof. Write $D=A^{*}\left(AA^{*}\right)^{+}$ . Observe that

{\begin{aligned}&&AA^{*}&=AA^{*}\left(AA^{*}\right)^{+}AA^{*}&\\&\Leftrightarrow &AA^{*}&=ADAA^{*}&\\&\Leftrightarrow &0&=(AD-I)AA^{*}&\\&\Leftrightarrow &0&=ADA-A&({\text{by Lemma 3}})\\&\Leftrightarrow &A&=ADA&\end{aligned}}

Similarly, $\left(AA^{*}\right)^{+}AA^{*}\left(AA^{*}\right)^{+}=\left(AA^{*}\right)^{+}$ implies that $A^{*}\left(AA^{*}\right)^{+}AA^{*}\left(AA^{*}\right)^{+}=A^{*}\left(AA^{*}\right)^{+}$ i.e. $DAD=D$ .

Additionally, $AD=AA^{*}\left(AA^{*}\right)^{+}$ so $AD=(AD)^{*}$ .

Finally, $DA=A^{*}\left(AA^{*}\right)^{+}A$ implies that $(DA)^{*}=A^{*}\left(\left(AA^{*}\right)^{+}\right)^{*}A=A^{*}\left(\left(AA^{*}\right)^{+}\right)A=DA$ .

Therefore, $D=A^{+}$ . $\square$

Exercise.

For every

\mathbb {K}

-matrix

A

,

A^{+}=(A^{*}A)^{+}A^{*}
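Both reduction formulas can be checked numerically. The sketch below, assuming NumPy, uses a random complex 4-by-3 matrix with a fixed seed; the explicit `rcond` cutoff is an assumption made so that the numerically zero singular value of the rank-deficient matrix $AA^{*}$ is treated as zero.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
Ah = A.conj().T                                     # conjugate transpose A*

A_plus = np.linalg.pinv(A)
via_AAh = Ah @ np.linalg.pinv(A @ Ah, rcond=1e-10)  # A* (A A*)^+, with A A* of rank 3
via_AhA = np.linalg.pinv(Ah @ A) @ Ah               # (A* A)^+ A*, with A* A invertible

print(np.allclose(A_plus, via_AAh), np.allclose(A_plus, via_AhA))
```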

Products

We now turn to calculating the Moore-Penrose inverse for a product of two matrices, $C=AB.$

Proposition.

If

A

has orthonormal columns i.e.

A^{*}A=I

, then for any

\mathbb {K}

-matrix

B

of the right dimensions,

(AB)^{+}=B^{+}A^{+}

.

Proof. Since $A^{*}A=I$ , $A^{+}=A^{*}$ . Write $C=AB$ and $D=B^{+}A^{+}=B^{+}A^{*}$ . We show that $D$ satisfies the Moore–Penrose criteria for $C$ .

{\begin{aligned}CDC&=ABB^{+}A^{*}AB=ABB^{+}B=AB=C,\\[4pt]DCD&=B^{+}A^{*}ABB^{+}A^{*}=B^{+}BB^{+}A^{*}=B^{+}A^{*}=D,\\[4pt](CD)^{*}&=D^{*}B^{*}A^{*}=A\left(B^{+}\right)^{*}B^{*}A^{*}=A\left(BB^{+}\right)^{*}A^{*}=ABB^{+}A^{*}=CD,\\[4pt](DC)^{*}&=B^{*}A^{*}D^{*}=B^{*}A^{*}A\left(B^{+}\right)^{*}=\left(B^{+}B\right)^{*}=B^{+}B=B^{+}A^{*}AB=DC.\end{aligned}}

Therefore, $D=C^{+}$ . $\square$

Exercise.

If

B

has orthonormal rows, then for any

\mathbb {K}

-matrix

A

of the right dimensions,

(AB)^{+}=B^{+}A^{+}

.

Another important special case which approximates closely that of invertible matrices is when $A$ has full column rank and $B$ has full row rank.

Proposition.

If

A

has full column rank and

B

has full row rank, then

(AB)^{+}=B^{+}A^{+}

.

Proof. Since $A$ has full column rank, $A^{*}A$ is invertible so $\left(A^{*}A\right)^{+}=\left(A^{*}A\right)^{-1}$ . Similarly, since $B$ has full row rank, $BB^{*}$ is invertible so $\left(BB^{*}\right)^{+}=\left(BB^{*}\right)^{-1}$ .

Write $C=AB$ and $D=B^{+}A^{+}=B^{*}\left(BB^{*}\right)^{-1}\left(A^{*}A\right)^{-1}A^{*}$ (using reduction to the Hermitian case). We show that $D$ satisfies the Moore–Penrose criteria for $C$ .

{\begin{aligned}CDC&=ABB^{*}\left(BB^{*}\right)^{-1}\left(A^{*}A\right)^{-1}A^{*}AB=AB=C,\\[4pt]DCD&=B^{*}\left(BB^{*}\right)^{-1}\left(A^{*}A\right)^{-1}A^{*}ABB^{*}\left(BB^{*}\right)^{-1}\left(A^{*}A\right)^{-1}A^{*}=B^{*}\left(BB^{*}\right)^{-1}\left(A^{*}A\right)^{-1}A^{*}=D,\\[4pt]CD&=ABB^{*}\left(BB^{*}\right)^{-1}\left(A^{*}A\right)^{-1}A^{*}=A\left(A^{*}A\right)^{-1}A^{*}=\left(A\left(A^{*}A\right)^{-1}A^{*}\right)^{*},\\\Rightarrow (CD)^{*}&=CD,\\[4pt]DC&=B^{*}\left(BB^{*}\right)^{-1}\left(A^{*}A\right)^{-1}A^{*}AB=B^{*}\left(BB^{*}\right)^{-1}B=\left(B^{*}\left(BB^{*}\right)^{-1}B\right)^{*},\\\Rightarrow (DC)^{*}&=DC.\end{aligned}}

Therefore, $D=C^{+}$ . $\square$
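The rank hypotheses matter: the product rule can fail without them. The NumPy sketch below checks the identity for one pair satisfying the hypotheses and exhibits a small pair for which it fails; the matrices are chosen here for illustration only.

```python
import numpy as np

# A has full column rank (3-by-2), B has full row rank (2-by-3).
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
B = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]])
lhs = np.linalg.pinv(A @ B, rcond=1e-10)            # A @ B is 3-by-3 of rank 2
rhs = np.linalg.pinv(B) @ np.linalg.pinv(A)
print(np.allclose(lhs, rhs))

# Hypotheses violated: A1 lacks full column rank, B1 lacks full row rank.
A1 = np.array([[1.0, 1.0]])
B1 = np.array([[1.0], [0.0]])
lhs1 = np.linalg.pinv(A1 @ B1)                      # (A1 B1)^+ = [[1]]
rhs1 = np.linalg.pinv(B1) @ np.linalg.pinv(A1)      # B1^+ A1^+ = [[1/2]]
print(lhs1, rhs1)
```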

We finally derive a formula for calculating the Moore-Penrose inverse of $AA^{*}$ .

Proposition.

If

B=A^{*}

, then

(AB)^{+}=A^{+*}A^{+}

.

Proof. Here, $B=A^{*}$ , and thus $C=AA^{*}$ and $D=A^{+*}A^{+}$ . We show that indeed $D$ satisfies the four Moore–Penrose criteria.

{\begin{aligned}CDC&=AA^{*}A^{+*}A^{+}AA^{*}=A\left(A^{+}A\right)^{*}A^{+}AA^{*}=AA^{+}AA^{+}AA^{*}=AA^{+}AA^{*}=AA^{*}=C\\[4pt]DCD&=A^{+*}A^{+}AA^{*}A^{+*}A^{+}=A^{+*}A^{+}A\left(A^{+}A\right)^{*}A^{+}=A^{+*}A^{+}AA^{+}AA^{+}=A^{+*}A^{+}AA^{+}=A^{+*}A^{+}=D\\[4pt](CD)^{*}&=\left(AA^{*}A^{+*}A^{+}\right)^{*}=A^{+*}A^{+}AA^{*}=A^{+*}\left(A^{+}A\right)^{*}A^{*}=A^{+*}A^{*}A^{+*}A^{*}\\&=\left(AA^{+}\right)^{*}\left(AA^{+}\right)^{*}=AA^{+}AA^{+}=A\left(A^{+}A\right)^{*}A^{+}=AA^{*}A^{+*}A^{+}=CD\\[4pt](DC)^{*}&=\left(A^{+*}A^{+}AA^{*}\right)^{*}=AA^{*}A^{+*}A^{+}=A\left(A^{+}A\right)^{*}A^{+}=AA^{+}AA^{+}\\&=\left(AA^{+}\right)^{*}\left(AA^{+}\right)^{*}=A^{+*}A^{*}A^{+*}A^{*}=A^{+*}\left(A^{+}A\right)^{*}A^{*}=A^{+*}A^{+}AA^{*}=DC\end{aligned}}

Therefore, $D=C^{+}$ . In other words:

\left(AA^{*}\right)^{+}=A^{+*}A^{+}

and, since $\left(A^{*}\right)^{*}=A$

\left(A^{*}A\right)^{+}=A^{+}A^{+*}

\square

Projectors and subspaces

The defining feature of classical inverses is that $AA^{-1}=A^{-1}A=I.$ What can we say about $AA^{+}$ and $A^{+}A$ ?

We can derive some properties easily from the more basic properties above:

Exercise.

Let

A

be a

\mathbb {K}

-matrix. Then

(AA^{+})^{2}=(AA^{+}),(A^{+}A)^{2}=(A^{+}A),(AA^{+})^{*}=(AA^{+})

and

(A^{+}A)^{*}=(A^{+}A)

We can conclude that $P=AA^{+}$ and $Q=A^{+}A$ are orthogonal projections.

Proposition.

Let

A

be a

\mathbb {K}

-matrix. Then

P=AA^{+}

and

Q=A^{+}A

are orthogonal projections

Proof. Indeed, consider the operator $P$ : any vector decomposes as

x=Px+(I-P)x

and for all vectors $x$ and $y$ satisfying $Px=x$ and $(I-P)y=y$ , we have

x^{*}y=(Px)^{*}(I-P)y=x^{*}P^{*}(I-P)y=x^{*}P(I-P)y=0

.

It follows that $PA=AA^{+}A=A$ and $A^{+}P=A^{+}AA^{+}=A^{+}$ . Similarly, $QA^{+}=A^{+}$ and $AQ=A$ . The orthogonal components are now readily identified. $\square$
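For a rank-one matrix $A=av^{T}$ the two projectors can be written down by hand: $P=aa^{T}/\|a\|^{2}$ and $Q=vv^{T}/\|v\|^{2}$ . The NumPy sketch below confirms this for one choice of real vectors $a$ and $v$ , which are assumptions of the example.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
v = np.array([1.0, 2.0])
A = np.outer(a, v)                          # rank-one 3-by-2 matrix a v^T
A_plus = np.linalg.pinv(A, rcond=1e-10)

P = A @ A_plus                              # projector onto the range of A
Q = A_plus @ A                              # projector onto the range of A^*

print(np.allclose(P @ P, P), np.allclose(P, P.T))
print(np.allclose(P, np.outer(a, a) / 14.0), np.allclose(Q, np.outer(v, v) / 5.0))
```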

We finish our analysis by determining image and kernel of the mappings encoded by the Moore-Penrose inverse.

Proposition.

Let

A

be a

\mathbb {K}

-matrix. Then

\operatorname {Ker} \left(A^{+}\right)=\operatorname {Ker} \left(A^{*}\right)

and

\operatorname {Im} \left(A^{+}\right)=\operatorname {Im} \left(A^{*}\right)

.

Proof. If $y$ belongs to the range of $A$ then for some $x$ , $y=Ax$ and $Py=PAx=Ax=y$ . Conversely, if $Py=y$ then $y=AA^{+}y$ so that $y$ belongs to the range of $A$ . It follows that $P$ is the orthogonal projector onto the range of $A$ . $I-P$ is then the orthogonal projector onto the orthogonal complement of the range of $A$ , which equals the kernel of $A^{*}$ .

A similar argument using the relation $QA^{*}=A^{*}$ establishes that $Q$ is the orthogonal projector onto the range of $A^{*}$ and $(I-Q)$ is the orthogonal projector onto the kernel of $A$ .

Using the relations $P\left(A^{+}\right)^{*}=P^{*}\left(A^{+}\right)^{*}=\left(A^{+}P\right)^{*}=\left(A^{+}\right)^{*}$ and $P=P^{*}=\left(A^{+}\right)^{*}A^{*}$ it follows that the range of P equals the range of $\left(A^{+}\right)^{*}$ , which in turn implies that the range of $I-P$ equals the kernel of $A^{+}$ . Similarly $QA^{+}=A^{+}$ implies that the range of $Q$ equals the range of $A^{+}$ . Therefore, we find,

{\begin{aligned}\operatorname {Ker} \left(A^{+}\right)&=\operatorname {Ker} \left(A^{*}\right).\\\operatorname {Im} \left(A^{+}\right)&=\operatorname {Im} \left(A^{*}\right).\\\end{aligned}}

\square

Applications

We present two applications of the Moore-Penrose inverse in solving linear systems of equations.

Least-squares minimization

Moore-Penrose inverses can be used for least-squares minimisation of a system of equations that might not necessarily have an exact solution.

Proposition.

For any

m\times n

matrix

A

,

\|Ax-b\|_{2}\geq \|Az-b\|_{2}

for all vectors $x\in \mathbb {K} ^{n}$ and $b\in \mathbb {K} ^{m}$ , where

z=A^{+}b

.

Proof. We first note that (stating the complex case), using the fact that $P=AA^{+}$ satisfies $PA=A$ and $P=P^{*}$ , we have

{\begin{alignedat}{2}A^{*}(Az-b)&=A^{*}(AA^{+}b-b)\\&=A^{*}(Pb-b)\\&=A^{*}P^{*}b-A^{*}b\\&=(PA)^{*}b-A^{*}b\\&=0\end{alignedat}}

so that ( ${\text{c.c.}}$ stands for the Hermitian conjugate of the previous term in the following)

{\begin{alignedat}{2}\|Ax-b\|_{2}^{2}&=\|Az-b\|_{2}^{2}+(A(x-z))^{*}(Az-b)+{\text{c.c.}}+\|A(x-z)\|_{2}^{2}\\&=\|Az-b\|_{2}^{2}+(x-z)^{*}A^{*}(Az-b)+{\text{c.c.}}+\|A(x-z)\|_{2}^{2}\\&=\|Az-b\|_{2}^{2}+\|A(x-z)\|_{2}^{2}\\&\geq \|Az-b\|_{2}^{2}\end{alignedat}}

as claimed. $\square$

Remark.

This lower bound need not be zero as the system

Ax=b

may not have a solution (e.g. when the matrix A does not have full rank or the system is overdetermined). If

A

is injective i.e. one-to-one (which implies

m\geq n

), then the bound is attained uniquely at

z

.
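A small overdetermined system illustrates the proposition. In the NumPy sketch below (the data are chosen for the example), $z=A^{+}b$ is compared with the solution returned by `numpy.linalg.lstsq`, the orthogonality relation $A^{*}(Az-b)=0$ from the proof is checked, and a perturbed vector is shown to have a larger residual.

```python
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])   # injective, 3 equations, 2 unknowns
b = np.array([6.0, 0.0, 0.0])

z = np.linalg.pinv(A) @ b                            # z = A^+ b
residual = A @ z - b
x_other = z + np.array([0.5, -0.25])                 # any other candidate

print(z)
print(np.allclose(A.T @ residual, 0.0))              # A^*(Az - b) = 0
print(np.linalg.norm(A @ x_other - b) > np.linalg.norm(residual))
```

Here the normal equations $3x+3y=6$ , $3x+5y=0$ give $z=(5,-3)$ by hand.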

Minimum-norm solution to a linear system

The proof above also shows that if the system $Ax=b$ is satisfiable i.e. has a solution, then necessarily $z=A^{+}b$ is a solution (not necessarily unique). We can say more:

Proposition.

If the system

Ax=b

is satisfiable, then

z=A^{+}b

is the unique solution with smallest Euclidean norm.

Proof. Note first, with $Q=A^{+}A$ , that $Qz=A^{+}AA^{+}b=A^{+}b=z$ and that $Q^{*}=Q$ . Therefore, assuming that $Ax=b$ , we have

{\begin{aligned}z^{*}(x-z)&=(Qz)^{*}(x-z)\\&=z^{*}Q(x-z)\\&=z^{*}\left(A^{+}Ax-z\right)\\&=z^{*}\left(A^{+}b-z\right)\\&=0.\end{aligned}}

Thus

{\begin{alignedat}{2}\|x\|_{2}^{2}&=\|z+(x-z)\|_{2}^{2}\\&=\|z\|_{2}^{2}+z^{*}(x-z)+{\overline {z^{*}(x-z)}}+\|x-z\|_{2}^{2}\\&=\|z\|_{2}^{2}+\|x-z\|_{2}^{2}\\&\geq \|z\|_{2}^{2}\end{alignedat}}

with equality if and only if $x=z$ , as was to be shown. $\square$

An immediate consequence of this result is that $z$ is also the uniquely smallest solution to the least-squares minimization problem for all $Ax=b$ , including when $A$ is neither injective nor surjective. It can be shown that the least-squares approximation $Az=y\approx b$ is unique. Thus it is necessary and sufficient for all $x$ that solve the least-squares minimization to satisfy $Ax=y=Az=AA^{+}b$ . This system always has a solution (not necessarily unique) as $Az$ lies in the column space of $A$ . From the above result the smallest $x$ which solves this system is $A^{+}(AA^{+}b)=A^{+}b=z$ .
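The minimum-norm property is visible already for a single equation in two unknowns. In the NumPy sketch below, the system $x_{1}+x_{2}=2$ is an assumption chosen for illustration; $A^{+}b$ returns the solution $(1,1)$ , and any other solution, obtained by adding a kernel vector, is longer.

```python
import numpy as np

A = np.array([[1.0, 1.0]])                  # one equation, two unknowns
b = np.array([2.0])

z = np.linalg.pinv(A) @ b                   # minimum-norm solution A^+ b
x_other = z + np.array([1.0, -1.0])         # another solution: (1, -1) lies in Ker A

print(z, A @ z, A @ x_other)
print(np.linalg.norm(z) < np.linalg.norm(x_other))
```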

Notes

↑ Some authors use slightly different dimensions for the factors. The two definitions are equivalent.

References

Ben-Israel, Adi; Greville, Thomas N.E. (2003). Generalized inverses: Theory and applications (2nd ed.). New York, NY: Springer. doi:10.1007/b97366. ISBN 978-0-387-00293-4.
Campbell, S. L.; Meyer, Jr., C. D. (1991). Generalized Inverses of Linear Transformations. Dover. ISBN 978-0-486-66693-8.
Nakamura, Yoshihiko (1991). Advanced Robotics: Redundancy and Optimization. Addison-Wesley. ISBN 978-0201151985.
Rao, C. Radhakrishna; Mitra, Sujit Kumar (1971). Generalized Inverse of Matrices and its Applications. New York: John Wiley & Sons. pp. 240. ISBN 978-0-471-70821-6.

Lie algebras

Let $V$ be a vector space. $(V,[,])$ is called a Lie algebra if it is equipped with the bilinear operator $V\times V\to V$ , denoted by $[,]$ , subject to the properties: for every $x,y,z\in V$

(i) [x, x] = 0
(ii) [x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0

(ii) is called the Jacobi identity.

Example: For $x,y\in \mathbf {R} ^{3}$ , define $[x,y]=x\times y$ , the cross product of $x$ and $y$ . The known properties of the cross product show that $(\mathbf {R} ^{3},[,])$ is a Lie algebra.
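Both axioms can be spot-checked for the cross product with integer vectors, where the arithmetic is exact. The Python sketch below is only a check on sample vectors, not a proof; the helper names are assumptions.

```python
def cross(x, y):
    """Cross product of two vectors in R^3 given as 3-tuples."""
    return (
        x[1] * y[2] - x[2] * y[1],
        x[2] * y[0] - x[0] * y[2],
        x[0] * y[1] - x[1] * y[0],
    )


def add(*vectors):
    return tuple(sum(components) for components in zip(*vectors))


def jacobi(x, y, z):
    """[x, [y, z]] + [y, [z, x]] + [z, [x, y]] for the bracket [x, y] = x × y."""
    return add(cross(x, cross(y, z)), cross(y, cross(z, x)), cross(z, cross(x, y)))


x, y, z = (1, 2, 3), (4, 5, 6), (7, 8, 10)
print(cross(x, x), jacobi(x, y, z))
```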

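Example: Let $V$ be an algebra and let $\operatorname {Der} (V)=\{D\in \operatorname {End} (V):D(xy)=(Dx)y+xDy\}$ . A member of $\operatorname {Der} (V)$ is called a derivation. For $D_{1},D_{2}\in \operatorname {Der} (V)$ , define $[D_{1},D_{2}]=D_{1}D_{2}-D_{2}D_{1}$ . Then $[D_{1},D_{2}]\in \operatorname {Der} (V)$ , so $\operatorname {Der} (V)$ is a Lie algebra.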

Theorem Let $V$ be a finite-dimensional vector space.

(i) If ${\mathfrak {g}}\subset {\mathfrak {gl}}_{k}(V)$ is a Lie algebra consisting of nilpotent elements, then there exists a nonzero $v\in V$ such that $x(v)=0$ for every $x\in {\mathfrak {g}}$ .
(ii) If ${\mathfrak {g}}$ is solvable, then there exists a common eigenvector $v\in V$ .

Theorem (Engel) ${\mathfrak {g}}$ is nilpotent if and only if $\operatorname {ad} (x)$ is nilpotent for every $x\in {\mathfrak {g}}$ .
Proof: The direct part is clear. For the converse, note from the preceding theorem that $\operatorname {ad} ({\mathfrak {g}})$ is a subalgebra of ${\mathfrak {n}}_{k}$ . Thus, $\operatorname {ad} ({\mathfrak {g}})$ is nilpotent and so is ${\mathfrak {g}}$ . $\square$

Theorem ${\mathfrak {g}}$ is solvable if and only if $[{\mathfrak {g}},{\mathfrak {g}}]$ is nilpotent.
Proof: Suppose ${\mathfrak {g}}$ is solvable. Then $\operatorname {ad} [{\mathfrak {g}},{\mathfrak {g}}]$ is a subalgebra of ${\mathfrak {b}}_{k}$ . Thus, $\operatorname {ad} [{\mathfrak {g}},{\mathfrak {g}}]\subset {\mathfrak {n}}_{k}$ . Hence, $\operatorname {ad} [{\mathfrak {g}},{\mathfrak {g}}]$ is nilpotent, and so $[{\mathfrak {g}},{\mathfrak {g}}]$ is nilpotent. For the converse, note the exact sequence:

0\longrightarrow [{\mathfrak {g}},{\mathfrak {g}}]\longrightarrow {\mathfrak {g}}\longrightarrow {\mathfrak {g}}/{[{\mathfrak {g}},{\mathfrak {g}}]}\longrightarrow 0

Since both $[{\mathfrak {g}},{\mathfrak {g}}]$ and ${\mathfrak {g}}/[{\mathfrak {g}},{\mathfrak {g}}]$ are solvable, ${\mathfrak {g}}$ is solvable. $\square$

Theorem (Weyl's theorem) Every representation of a finite-dimensional semisimple Lie algebra:

{\mathfrak {g}}\to \operatorname {End} (V)

is completely reducible.
Proof: It suffices to prove that every ${\mathfrak {g}}$ -submodule has a ${\mathfrak {g}}$ -submodule complement. Furthermore, the proof reduces to the case when $W$ is simple (as a module) and has codimension one. Indeed, given a ${\mathfrak {g}}$ -submodule $W$ , let $E\subset \operatorname {Hom} (V,W)$ be the subspace consisting of elements $f$ such that $f|_{W}$ is a scalar multiplication. Since any commutator of elements $f\in E$ is zero (that is, multiplication by zero), it is clear that $E/[E,E]$ has dimension 1. $E$ may not be simple, but by induction on the dimension of $E$ , we can assume that. Hence, $E$ has complement of dimension 1, which is spanned by, say, $f$ . It follows that $V$ is the direct sum of $W$ and the kernel of $f$ . Now, to complete the proof, let $W$ be a simple ${\mathfrak {g}}$ -submodule of codimension 1. Let $c$ be a Casimir element of ${\mathfrak {g}}\to \operatorname {End} (V)$ . It follows that $V$ is the direct sum of $W$ and the kernel of $c$ . $\square$ (TODO: obviously, the proof is very sketchy; we need more details.)

References

Algebraic Groups and Arithmetic Groups

Sheaf theory

We say ${\mathcal {F}}$ is a pre-sheaf on a topological space $X$ if

(i) ${\mathcal {F}}(U)$ is an abelian group for every open subset $U\subset X$
(ii) For each inclusion $U\hookrightarrow V$ , we have the group morphism $\rho _{V,U}:{\mathcal {F}}(V)\to {\mathcal {F}}(U)$ such that
$\rho _{U,U}$ is the identity and $\rho _{W,U}=\rho _{V,U}\circ \rho _{W,V}$ for any inclusion $U\hookrightarrow V\hookrightarrow W$

A pre-sheaf is called a sheaf if the following "gluing axiom" holds:

For each open subset

U

and its open cover

U_{j}

, if

f_{j}\in {\mathcal {F}}(U_{j})

are such that

f_{j}=f_{k}

in

U_{j}\cap U_{k}

, then there exists a unique

f\in {\mathcal {F}}(U)

such that

f|_{U_{j}}=f_{j}

for all

j

.

Note that the uniqueness implies that if $f,g\in {\mathcal {F}}(U)$ and $f|_{U_{j}}=g|_{U_{j}}$ for all $j$ , then $f=g$ . In particular, $f|_{U_{j}}=0$ for all $j$ implies $f=0$ .
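The gluing axiom is easiest to picture for the sheaf of integer-valued functions on a finite discrete space, where a section over $U$ is just a function on $U$ . The Python sketch below models sections as dictionaries keyed by points; this encoding and the names `restrict` and `glue` are assumptions made purely for illustration.

```python
def restrict(f, U):
    """Restriction of a section f (a dict point -> value) to the subset U."""
    return {x: f[x] for x in U}


def glue(sections):
    """Glue sections f_j defined on U_j = set of keys of f_j.

    Returns the unique section on the union of the U_j restricting to each f_j,
    or None when two sections disagree on an overlap.
    """
    glued = {}
    for f in sections:
        for x, value in f.items():
            if x in glued and glued[x] != value:
                return None
            glued[x] = value
    return glued


f1 = {1: 10, 2: 20}            # section over U_1 = {1, 2}
f2 = {2: 20, 3: 30}            # section over U_2 = {2, 3}, agrees with f1 on {2}
f = glue([f1, f2])
print(f, restrict(f, f1.keys()) == f1)
print(glue([f1, {2: 99}]))     # disagreement on the overlap: nothing to glue
```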

Example: Let $G$ be a topological abelian group (e.g., $\mathbf {R}$ ). For each open subset $U\subset X$ , let ${\mathcal {F}}(U)$ be the group of all continuous maps from $U$ to $G$ . Then ${\mathcal {F}}$ forms a sheaf. In particular, suppose the topology for $G$ is discrete. Then ${\mathcal {F}}$ is called a constant sheaf.

Given sheaves ${\mathcal {F}}$ and ${\mathcal {G}}$ , a sheaf morphism $\phi :{\mathcal {F}}\to {\mathcal {G}}$ is a collection of group morphisms $\phi _{U}:{\mathcal {F}}(U)\to {\mathcal {G}}(U)$ satisfying: for every open subset $U\subset V$ ,

\phi _{U}\circ \rho _{V,U}=\rho _{V,U}\circ \phi _{V}

where the first $\rho _{V,U}$ is the one that comes with ${\mathcal {F}}$ and the second is the one that comes with ${\mathcal {G}}$ .

Define $(\operatorname {ker} \phi )(U)=\operatorname {ker} \phi _{U}$ for each open subset $U$ . $\operatorname {ker} \phi$ is then a sheaf. In fact, let $U_{j}$ be an open cover of $U$ and suppose $f_{j}\in \operatorname {ker} \phi _{U_{j}}$ agree on the overlaps $U_{j}\cap U_{k}$ . Then, since ${\mathcal {F}}$ is a sheaf, there is a unique $f\in {\mathcal {F}}(U)$ such that $f|_{U_{j}}=f_{j}$ . But since

(\phi _{U}f)|_{U_{j}}=\phi _{U_{j}}(f|_{U_{j}})=\phi _{U_{j}}f_{j}=0

for all $j$ , we have $\phi _{U}f=0$ by the uniqueness part of the gluing axiom for ${\mathcal {G}}$ . Unfortunately, $\operatorname {im} \phi$ does not turn out to be a sheaf if it is defined in the same way. We thus define $(\operatorname {im} \phi )(U)$ to be the set of all $f\in {\mathcal {G}}(U)$ such that there is an open cover $U_{j}$ of $U$ such that each $f|_{U_{j}}$ is in the image of $\phi _{U_{j}}$ . This is a sheaf. In fact, as before, let $f\in {\mathcal {G}}(U)$ be such that $f|_{U_{j}}\in (\operatorname {im} \phi )(U_{j})$ for every member of an open cover $U_{j}$ of $U$ . Each $U_{j}$ then has an open cover on whose members $f$ restricts into the image of $\phi$ , and these covers together give an open cover of $U$ such that $f$ restricted to each member $V$ of the cover is in the image of $\phi _{V}$ .
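A standard illustration of why the image has to be defined locally is the exponential map. The following is a sketch under assumptions that are not in the text: $X=\mathbf {C}\setminus \{0\}$ , ${\mathcal {F}}$ is the sheaf of continuous complex-valued functions (a group under addition), ${\mathcal {G}}$ is the sheaf of continuous nowhere-zero complex-valued functions (a group under multiplication), and $\phi _{U}(g)=e^{g}$ .

```latex
% Assumptions (for illustration only): X = \mathbf{C} \setminus \{0\},
% \mathcal{F}(U) = continuous maps U \to \mathbf{C} (group under +),
% \mathcal{G}(U) = continuous maps U \to \mathbf{C} \setminus \{0\} (group under \cdot),
% \phi_U(g) = e^{g}.
\begin{align*}
f(z) &= z \in \mathcal{G}(X), \\
U_1 &= \mathbf{C} \setminus (-\infty, 0], & f|_{U_1} &= \phi_{U_1}(\log_1), \\
U_2 &= \mathbf{C} \setminus [0, \infty), & f|_{U_2} &= \phi_{U_2}(\log_2),
\end{align*}
% where \log_1, \log_2 are continuous branches of the logarithm on U_1, U_2.
% But f \neq \phi_X(g) for every g \in \mathcal{F}(X): there is no continuous
% logarithm of z on all of \mathbf{C} \setminus \{0\}.
```

Here $f|_{U_{1}}$ and $f|_{U_{2}}$ lie in the images of $\phi _{U_{1}}$ and $\phi _{U_{2}}$ and agree on $U_{1}\cap U_{2}$ , yet the glued section $f$ is not in the image of $\phi _{X}$ . The assignment $U\mapsto \operatorname {im} \phi _{U}$ therefore fails the gluing axiom, while $f$ does belong to $(\operatorname {im} \phi )(X)$ as defined through open covers.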

Let ${\mathcal {F}}^{0},{\mathcal {F}}^{1},{\mathcal {F}}^{2}$ be sheaves on the same topological space.

A sheaf ${\mathcal {F}}$ on $X$ is said to be flabby if $\rho _{X,U}:{\mathcal {F}}(X)\to {\mathcal {F}}(U)$ is surjective for every open subset $U\subset X$ . Let ${\mathcal {F}}_{p}=\varinjlim _{U\ni p}{\mathcal {F}}(U)$ be the stalk of ${\mathcal {F}}$ at $p$ , and, for each $f\in {\mathcal {F}}(U)$ , write $f|_{p}$ for the image of $f$ in ${\mathcal {F}}_{p}$ and define $\operatorname {supp} f=\{p\in U|f|_{p}\neq 0\}$ . $\operatorname {supp} f$ is closed in $U$ since $f|_{p}=0$ implies $p$ has a neighborhood $V\subset U$ such that $f|_{q}=0$ for every $q\in V$ . Define $\operatorname {Supp} {\mathcal {F}}=\{x\in X|{\mathcal {F}}_{x}\neq 0\}$ . If $i:Z\hookrightarrow X$ is a closed subset and $\operatorname {Supp} {\mathcal {F}}\subset Z$ , then the natural map ${\mathcal {F}}\to i_{*}i^{-1}{\mathcal {F}}$ is an isomorphism.
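The support of a section is defined through germs rather than values, and a short worked example may make the difference concrete. It assumes, for illustration only, that $X=\mathbf {R}$ and that ${\mathcal {F}}$ is the sheaf of continuous real-valued functions.

```latex
% Assumption: X = \mathbf{R}, \mathcal{F}(U) = continuous maps U \to \mathbf{R}.
\begin{align*}
f(x) &= \max(x, 0) \in \mathcal{F}(\mathbf{R}), \\
f|_p &= 0 \iff f \text{ vanishes on a neighborhood of } p \iff p < 0, \\
\operatorname{supp} f &= [0, \infty) .
\end{align*}
% Note f(0) = 0 but f|_0 \neq 0: f is nonzero on every neighborhood of 0.
```

The point $0$ belongs to $\operatorname {supp} f$ even though $f(0)=0$ , and the resulting set $[0,\infty )$ is closed, as the general argument predicts.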

4 Theorem Suppose

0\longrightarrow {\mathcal {F}}^{0}\longrightarrow {\mathcal {F}}^{1}\longrightarrow {\mathcal {F}}^{2}\longrightarrow 0

is exact. Then, for every open subset $U$

0\longrightarrow \Gamma _{Z}(U,{\mathcal {F}}^{0})\longrightarrow \Gamma _{Z}(U,{\mathcal {F}}^{1})\longrightarrow \Gamma _{Z}(U,{\mathcal {F}}^{2})

is exact. Furthermore, $\Gamma _{Z}(U,{\mathcal {F}}^{1})\to \Gamma _{Z}(U,{\mathcal {F}}^{2})$ is surjective if ${\mathcal {F}}^{0}$ is flabby.
Proof: That the kernel of ${\mathcal {F}}^{0}\longrightarrow {\mathcal {F}}^{1}$ is trivial means that ${\mathcal {F}}^{0}(U)\longrightarrow {\mathcal {F}}^{1}(U)$ has trivial kernel for any $U$ . Thus the exactness at the first term is clear. Next, denoting both ${\mathcal {F}}^{0}\to {\mathcal {F}}^{1}$ and ${\mathcal {F}}^{1}\to {\mathcal {F}}^{2}$ by $d$ , suppose $f\in {\mathcal {F}}^{1}(U)$ with $df=0$ . Then there exists an open cover $U_{j}$ of $U$ and $u_{j}\in {\mathcal {F}}^{0}(U_{j})$ such that $du_{j}=f|_{U_{j}}$ . Since $du_{j}=f=du_{k}$ in $U_{j}\cap U_{k}$ and $d_{U_{j}\cap U_{k}}$ is injective by the early part of the proof, we have $u_{j}=u_{k}$ in $U_{j}\cap U_{k}$ and so we get $u\in {\mathcal {F}}^{0}(U)$ such that $du=f$ . Finally, to show that the last map is surjective when ${\mathcal {F}}^{0}$ is flabby, let $f\in {\mathcal {F}}^{2}(U)$ , and let $\Omega$ be the set of pairs $(V,u)$ with $V\subset U$ open and $u\in {\mathcal {F}}^{1}(V)$ such that $du=f|_{V}$ , ordered by $(V,u)\leq (V',u')$ if $V\subset V'$ and $u'|_{V}=u$ . If $\{(U_{j},u_{j})|j\in J\}\subset \Omega$ is totally ordered, then let $V=\cup _{j}U_{j}$ . Since $u_{j}$ agree on overlaps by totally ordered-ness, there is $u\in {\mathcal {F}}^{1}(V)$ with $u|_{U_{j}}=u_{j}$ . Thus, $(V,u)$ is an upper bound of the collection $(U_{j},u_{j})$ . By Zorn's Lemma, we then find a maximal element $(U_{0},u_{0})$ . We claim $U_{0}=U$ . Suppose not. Then, since ${\mathcal {F}}^{1}\to {\mathcal {F}}^{2}$ is surjective as a sheaf morphism, there exists $(U_{1},u_{1})\in \Omega$ with $U_{1}\not \subset U_{0}$ . Since $d(u_{0}-u_{1})=0$ in $U_{0}\cap U_{1}$ , by the early part of the proof, there exists $a\in {\mathcal {F}}^{0}(U_{0}\cap U_{1})$ with $da=u_{0}-u_{1}$ . Since ${\mathcal {F}}^{0}$ is flabby, $a$ extends to a section over $X$ ; restricting it to $U_{1}$ , we may assume $a\in {\mathcal {F}}^{0}(U_{1})$ . Then $d(u_{1}+da)=du_{1}=f|_{U_{1}}$ (so $(U_{1},u_{1}+da)\in \Omega$ ) while $u_{1}+da=u_{0}$ in $U_{0}\cap U_{1}$ . Hence $u_{0}$ and $u_{1}+da$ glue to a section over $U_{0}\cup U_{1}$ , which contradicts the maximality of $(U_{0},u_{0})$ . Hence, we conclude $U_{0}=U$ and so $du_{0}=f$ . $\square$
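The flabbiness hypothesis cannot simply be dropped. As a hedged illustration on ordinary sections over the whole space, consider the exponential sequence on $X=\mathbf {C}\setminus \{0\}$ ; the choice of space and sheaves below is an assumption made for this example and does not come from the text.

```latex
% Assumptions (for illustration only): X = \mathbf{C} \setminus \{0\},
% \mathcal{F}^0 = constant sheaf with values in 2\pi i \mathbf{Z},
% \mathcal{F}^1(U) = continuous maps U \to \mathbf{C},
% \mathcal{F}^2(U) = continuous maps U \to \mathbf{C} \setminus \{0\}.
\begin{align*}
0 \longrightarrow \mathcal{F}^0 \longrightarrow \mathcal{F}^1
  \xrightarrow{\ g \mapsto e^{g}\ } \mathcal{F}^2 \longrightarrow 0
  & \quad \text{is exact as a sequence of sheaves,} \\
0 \longrightarrow \mathcal{F}^0(X) \longrightarrow \mathcal{F}^1(X)
  \longrightarrow \mathcal{F}^2(X)
  & \quad \text{is exact, but } z \in \mathcal{F}^2(X) \text{ is not of the form } e^{g}.
\end{align*}
% \mathcal{F}^0 is not flabby: X is connected, so \mathcal{F}^0(X) = 2\pi i \mathbf{Z},
% while for U a union of two disjoint open discs
% \mathcal{F}^0(U) = (2\pi i \mathbf{Z})^2, and the restriction is not surjective.
```

In terms of the proof, the step that breaks is the extension of $a\in {\mathcal {F}}^{0}(U_{0}\cap U_{1})$ : when $U_{0}\cap U_{1}$ is disconnected, a locally constant $a$ taking different values on the two components does not extend to $U_{1}$ .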

4 Corollary

0\longrightarrow {\mathcal {F}}^{0}\longrightarrow {\mathcal {F}}^{1}\longrightarrow {\mathcal {F}}^{2}\longrightarrow 0

is exact if and only if

0\longrightarrow {\mathcal {F}}_{p}^{0}\longrightarrow {\mathcal {F}}_{p}^{1}\longrightarrow {\mathcal {F}}_{p}^{2}\longrightarrow 0

is exact for every $p\in X$ .

Suppose $f:X\to Y$ is a continuous map and ${\mathcal {F}}$ is a sheaf on $X$ . The sheaf $f_{*}{\mathcal {F}}$ (called the pushforward of ${\mathcal {F}}$ by $f$ ) is defined by $f_{*}{\mathcal {F}}(U)={\mathcal {F}}(f^{-1}(U))$ for an open subset $U\subset Y$ . Suppose instead $f:Y\to X$ is a continuous map. The sheaf $f^{-1}{\mathcal {F}}$ on $Y$ is then defined as the sheafification of the presheaf $U\mapsto \varinjlim _{V\supset f(U)}{\mathcal {F}}(V)$ where $U$ is an open subset of $Y$ and $V$ runs over the open subsets of $X$ containing $f(U)$ . The two are related in the following way. Let $f:X\to Y$ again and let $U\subset X$ be an open subset. Then, before sheafification, $f^{-1}f_{*}{\mathcal {F}}(U)$ consists of classes of elements $s$ in ${\mathcal {F}}(f^{-1}(V))$ where $V\supset f(U)$ is open in $Y$ . Since $f^{-1}(V)\supset U$ , we find a map