Set Theory/Ordinals

Finite and transfinite ordinal numbers

Historical context

Preliminaries

Standard representation of ordinal numbers

The definition of ordinal numbers offers little insight into their nature. In situations like this pure mathematicians create representations of the objects they wish to study. Such representations are built from familiar mathematical constructions and are equivalent to the obstruce objects. By manipulating the familiar objects in the representation, the pure mathematician may thus investigate the structure of the mysterious abstract entities.

The most common representation of the ordinal numbers, due to Von Neumann, is as follows. The ordinal 0 is defined to be the empty set $\emptyset$ ; the ordinal 1 is defined to be the set $\{0\}$ , which is of course equal to $\{\emptyset \}$ . Similarly, the ordinal 2 is the set $\{0,1\}$ ; the ordinal 3 is the set $\{0,1,2\}$ ; the ordinal 4 is the set $\{0,1,2,3\}$ . Any finite ordinal $n$ is defined to be the set $\{0,1,2,\ldots ,n-1\}$ (or, in a rigorous notation, the successor of an ordinal α is defined as the set $s(\alpha )=\alpha \cup \{\alpha \}\,$ ).

In fact this definition extends naturally to transfinite ordinals. The ordinal $\omega$ is the set consisting of every finite ordinal $\{0,1,2,3,\ldots \}$ . Again, $\omega +1$ is the set $\{0,1,2,3,\ldots ,\omega \}$ ; $\omega +2$ is the set $\{0,1,2,3,\ldots ,\omega ,\omega +1\}$ ; and so on.

The ordinal $\omega +\omega$ (or $\omega *2$ ) is the set consisting of all finite ordinals and ordinals of the form $\omega +n$ , where $n$ is a finite ordinal. Thus $\omega +\omega =\{0,1,2,\ldots ;\omega ,\omega +1,\omega +2,\ldots \}$ .

The ordinal $\omega _{1}$ is the first uncountable ordinal, and is the set of all countable ordinals.

Well-orderings

The basic property of ordinals is well-ordering. This will allow the above informal discussion to be given a formal definition and additionally allow us to prove some necessary basic facts.

Definition: a total ordering of a set P is a well-ordering of P if every nonempty subset has a minimal element. Definition: an initial segment of a well-ordered set W is a set of the form {w∈W|w<v} for some v∈W.

Well-orderings are a kind of "universal order" in the sense that given any two well-ordered sets, one well-ordered set is order-isomorphic to the other or an initial segment of the other. This will be established in the following few theorems.

Theorem: any increasing function from a well-ordered set to itself is rising: if (W, <) is a well-ordered set and f: W→W is a function such that x>y→f(x)>f(y), then for all w∈W, f(w)≥w.

Proof: if {w∈W: f(w)<w} was nonempty, let v=min {w∈W: f(w)<w}. Then f(v)<v, and because f is increasing, f(f(v)) < f(v), so f(v)∈{w∈W: f(w)<w}. Then f(v)≥v because v is the minimum element of that set, contradicting f(v)<v. Therefore, {w∈W: f(w)<w} must be empty.

Theorem: any order-automorphism on a well-ordered set is the identity.

Proof: Given any automorphism f: W→W, both the function and its inverse must be increasing. By the above theorem, we can conclude:
f(x) ≥ x
and also f^-1(x) ≥ x
f is increasing, so f(f^-1(x)) ≥ f(x)
Simplifying, x ≥ f(x).
f(x) = x

Theorem: if *f* and *g* are both order isomorphisms of well-ordered sets V and W, then f=g.

Proof: g^-1∘f and f∘g^-1 are both automorphisms on V and W respectively, so they are both equal to the identity function.

Theorem: There exists no order isomorphism of a well-ordered set to any of its own initial segments.

Proof: Let (W, <) be a well-ordering and let S_v = {w∈W|w<v} be an initial segment. If f: W→S_v is an order isomorphism, then expanding the range of f to W, f is an increasing function to itself so f(v)≥v. But f(v)∈{w∈W|w<v}, so f(v)<v. So, no such f can exist.

Theorem: The restriction of an order isomorphism f: W→V to an initial segment S_w of W is an order isomorphism of that initial segment to the initial segment S_f(w) of V.

Proof: for any w'∈S_w, f(w')<f(w) so f(w')∈S_f(w), f maps S_w into S_f(w). Furthermore, for any v∈S_f(w), v<f(w) so f^-1(v)<f^-1f(w)) = w, so f^-1(v) is in the domain of the restricted function, where f(f^-1(v)) = v, meaning that the restricted function f also maps onto S_f(w). Therefore, restricting the domain of f to S_w, it is possible to then shrink its range to S_f(w), making it a bijection.

We now finally come to the main idea behind well-orderings.

Theorem: Given any two well-ordered sets, one set is order isomorphic to the other or an initial segment of the other.

Proof: Let (V, <) and (W, <) be two well-ordered sets. First, we establish the following fact using the previous theorem: the set of all w∈W such that the initial segment S_w in W is isomorphic to some initial segment S_v for some v∈V is *closed downwards*. Given any w∈W, if such an isomorphism exists for S_w, then given any w'<w, there also exists a v'∈V such that S_w' is order isomorphic to S_v'.

This is because by the above theorem, restricting the isomorphism to the initial segment of w' within S_v is an order isomorphism of that segment to the initial segment of f(w') within S_w, where f is the order isomorphism. But initial segments within S_v are also initial segments within V, because order is transitive, and the same goes for initial segments within W.

Suppose there existed a v∈V where S_v is not order isomorphic to any S_w for w∈W. Then v belongs to the complement of the above mentioned set and is therefore nonempty: let v₀ be the least such v. Then any v>v₀ will also belong to the complement: if S_v was isomorphic to some S_w, then v₀<v implies the same must be true for v, which it is not.

For any v∈S_v₀, v<v₀, and given that v₀ is the least element for which such an isomorphism does not exist, S_v must be isomorphic to some S_w for w∈W. Furthermore, such a w must be unique, because if S_v was also isomorphic to S_w' for w'∈W, then without loss of generality, suppose w'<w and let f and g be the isomorphisms to S_w and S_w' respectively. Then g∘f^-1: S_w→S_w' is an isomorphism of a well-ordered set to its own initial segment, which was proved to not be possible.

This establishes a function f: S_v₀→W defined by:

f(v) = the unique w∈W such that S__v is order isomorphic to S__w

We show that the range of f is also closed downwards, meaning that if w∈f(V) and w'<w, then w'∈f(V). If w∈f(V), then there exists a v∈V such that f(v) = w, so there exists an order isomorphism g from S__v and S__w. For any w'<w, the restriction of f^-1 to the initial segment by w' in S_v is also an order isomorphism to its initial segment by g^-1(w'). As remarked earlier, initial segments within initial segments are also initial segments of the whole, by the transitive property. So, S_g^-1(w') is order isomorphic to S_w', meaning f(g^-1(w')) = w'. Therefore, w'∈f(V).

Now to prove that f is an onto function. Note that if it was not, then the complement of the image set W-f(V) is nonempty, so let w₀=min W-f(V). w₀ being the minimum means that for any w<w₀, w∈f(V), so S_w₀⊆f(V). Conversely, if w∈f(V), then w≠w₀ because w₀∉f(V). Second, if w>w₀, then because the range is closed downwards, w₀∈f(V), which is false. Therefore, w<w₀, so w∈S_w₀, so f(V)=S_w₀. But this implies that f is an order isomorphism from S_v₀ to S_w₀, contradicting that v₀ is defined to be the least such element of V for which such an isomorphism does not exist. Therefore, W-f(V) must be empty and f is onto.

With all that, f is therefore an order isomorphism between S_v₀ and W.

This takes care of the case where there existed a v∈V where S_v is not order isomorphic to any S_w for w∈W. The same argument goes for the case where there existed a w∈W where S_w is not order isomorphic to any S_v for v∈V by applying the same result, switching V and W around, and finding an isomorphism between V and an initial segment of W.

Finally, suppose neither exists, and for every v∈V, S_vis order isomorphic to S_w for some w∈W, and every S_wis order isomorphic to S_v for some v∈V. Then the function defined, this time as f: V→W with domain as all of V, will be an order isomorphism between V and W.

Note that only one of these three cases can hold, by the impossibility of an order isomorphism of a well-ordered set with its own initial segment.

Transitive sets and ordinals

The basic ordering under the ordinals is the set inclusion relation, ∈. We establish a class of sets such that a<b whenever a∈b. The basic requirement to be an ordinal is transitivity.

Definition: A set T is transitive if S∈T implies S⊆T. Expanding out the definition of subset: a set T is transitive if for all S∈T and R∈S, R∈T.

In a sense, transitive sets are "closed downwards" by the ∈ relation.

Definition: an ordinal is a transitive set that is well-ordered by the ∈ relation.

Keep in mind that well-ordered also implies a linear ordering. The symbols ∈ and < will therefore be used interchangeably for elements of ordinals.

We now establish some basic facts for working with ordinals.

Theorem: ∅ is an ordinal.
Proof: it is transitive and well-ordered by ∈ in the vacuous sense.

Theorem: If β∈α and α is an ordinal, then β is an ordinal.
Proof: α is transitive so β⊆α, therefore β is also well-ordered by ∈. If γ∈β then γ∈α because β⊆α, and furthermore γ⊆α because α is transitive. Then if δ∈γ, then δ∈α. So δ, γ, and β are all elements of α and δ∈γ∈β, so δ∈β because ∈ is a transitive relation in α.

Theorem: If β⊆α and β≠α and α and β are both ordinals, then β∈α.
Proof: α-β is nonempty and a subset of α, let γ = min(α-β).

If δ∈γ, then δ<γ=min(α-β), so δ∉α-β, meaning δ∈β. Conversely, if δ∈β, then δ∉α-β, and also δ∈α because β⊆α, and α is totally ordered so either δ>γ or δ<γ because they are both elements of α. The first case is actually impossible: γ<δ (or γ∈δ) together with δ∈β implies that γ∈β because β is a transitive set, but γ is supposed to be an element of α-β. Therefore, δ<γ or δ∈γ.

Therefore, γ=β, and γ∈α-β, so γ∈α.

Theorem: If α and β are ordinals, one must be a subset of the other.
Proof: The intersection α∩β is also an ordinal: as a subset of α, it is well-ordered by ∈. It is transitive because if γ∈α∩β, then γ⊆α and γ⊆β, so γ⊆α∩β.

α∩β must be equal to α or β. Suppose this was not the case: then α∩β⊆α but α∩β≠α so α∩β∈α by the preceding theorem. Likewise, α∩β⊆β but α∩β≠β so α∩β∈β. But α∩β∈α∩β, contradicting that α is well-ordered by ∈ (in particular, ∈ must be a strict order, so it has the anti-reflexive property).

The following corollaries are then easily established.

Theorem: If α and β are ordinals and α≠β, then α∈β or β∈α. From here on, denote α∈β also as α<β. ∈ is therefore a total ordering of the class of all ordinals.

Proof: First, verify that ∈ satisfies the transitive and anti-reflexive properties. If α is an ordinal and α∈α, then elements of α need to be well-ordered, in particular anti-reflexive by ∈, which is violated by α∈α. So α∉α for all ordinals. Second, if α, β, and γ are ordinals and α<β, β<γ, then β∈γ and α∈β, therefore α∈γ because γ is a transitive set.

That ∈ is a total ordering is a consequence of the above theorem.

Now verify well-orderedness. If C is a nonempty class of ordinals, let α∈C. If α is the minimum of C, then C has a minimum. Else, there exists a β<α where β∈C. Then α∩C is a nonempty subset of α and has a least element, so let γ=min α∩C. Then for all δ∈C, from what was just proven, either δ=α, δ<α, or δ>α. In the first case, γ∈α so γ<α=δ. In the second case, δ∈α so δ∈α∩C, so γ≤δ. In the third case, γ<α and α<δ, so γ<δ. In all three cases, γ≤δ.

Theorem: If α is an ordinal, then α = {β | β < α}

Proof: From definition of < between ordinals as ∈.

Theorem: The intersection of a nonempty class of ordinals is another ordinal and in fact the greatest upper bound (inf) of that class. Henceforth, it shall be denoted inf.

Proof: Let C be a class of ordinals. Suppose β∈∩C. Then for any γ∈β and δ∈C, β∈δ and δ is an ordinal, so γ∈δ. Therefore, γ∈∩C, so ∩C is transitive. Also, ∩C is a subset of an ordinal, and is therefore well-ordered by ∈.

Theorem: The union of a set of ordinals is another ordinal, and in fact the least upper bound (sup) of that set. Henceforth, the union of any set of ordinals shall be denoted sup.

Proof: Let S be a set of ordinals, and let α∈∪S. Then there exists β∈S, α∈β. β is an ordinal and is transitive, so for any γ∈α, γ∈β, thus γ∈∪S. Thus, α⊆∪S, so ∪S is transitive. Furthermore, all elements of ∪S are ordinals, and because all ordinals are well-ordered under < (meaning ∈), any set of ordinals, in particular ∪S is well-ordered under <.

Theorem: If α is an ordinal, then α∪{α} is an ordinal. If β>α is an ordinal greater than α, then β≥α∪{α}.

Proof: α∪{α} is a set of ordinals, and so is well-ordered under <. It is transitive: if β∈α∪{α}, either β∈α or β=α. In the first case, if γ∈β, then γ∈α and γα∪{α}, so β⊆α∪{α}. In the second case, β=α⊆α∪{α}.

Denote by S(α)=α∪{α}. S(α) is called the successor of α because by the above, there are no ordinals between them: no ordinal β satisfies α<β<S(α) for any ordinal α. Given any α, if there exists an ordinal β such that α=S(β), then α is called a successor ordinal. Else, α is called a limit ordinal.

The class of all ordinals Ord is not a set. If it was a set, sup Ord is an ordinal and sup Ord≥α for any ordinal. But S(sup Ord) is also an ordinal and S(sup Ord)>sup Ord, a contradiction.

We finally come to the main point of ordinals: as "universal well-ordered functions".

Theorem: every well-ordered set is order isomorphic to a unique ordinal.

Proof: Let W be a well-ordered set. Consider the formula "F(w) = α when S_w (the initial segment in W by w) is order isomorphic to the ordinal α".

First, note that for a given w∈W, if such an ordinal exists it is unique (otherwise an ordinal would be isomorphic to its own initial segment), and it also exists for all w'<w, and in fact the order isomorphism on S_w equals F, restricted to S_w. This is because if o is an order isomorphism between S_w and α, then for any w'<w, the restriction of o to S_w' is an order isomorphism between S_w' and o(w')<α. So F(w') = o(w') < F(w). This also shows that F (and also F^-1) is monotone when it is defined.

The reverse is also true: if α=F(w) for some w∈W and β<α, then there exists a w'<w such that β=F(w'). This is because as remarked before, F restricted to S_w is an order isomorphism between S_w and α, and in particular it is surjective. Furthermore, note that for any w∈W, if F is defined for all w'<w, then it is defined on w and in fact equal to F(S_w). First, F is monotone, so it is an isomorphism between its domain an image, so all that's needed is to prove that F(S_w) is an ordinal. It is a set of ordinals and thus well-ordered by ∈. Also if α ∈ F(S_w) and β<α, then β ∈ F(S_w) by what was just proven, so F(S_w) is a transitive set. Therefore, F(S_w) is an ordinal.

This shows that in fact, F is defined on all of W: otherwise, let v = min { w∈W | S_w is not isomorphic to any ordinal }. Then F is defined on all v'<v, so it is defined on v by the above paragraph, a contradiction.

Now adjoin to W one element that is greater than all elements of W, call it m. The resulting set remains a well-ordered set. Then the above statements also apply to this new well-ordered set, and so F(m) is defined it, meaning S_m is order isomorphic to the ordinal F(m). But W=S(m).

First few ordinals

Define 0 = ∅. Then define the successors of 0:

1 = S(0), 2 = S(1), 3 = S(2), and so on.

0 is a limit ordinal. Denote by ω the first nonzero limit ordinal. The existence of ω follows from the axiom of infinity. Ordinals that are less than ω are called finite or natural numbers. The set of natural numbers is sometimes denoted N, which is also equal to ω.

Induction, recursion

Well-orderedness allows "counting upwards." This is given in the induction principle:

If a class C satisfies 3 properties:

1. 0∈C 2. α∈C → S(α)∈C 3. α is limit ordinal and (β<α→β∈C) together imply α∈C

Then C contains all ordinals. This follows from well-ordering: if Ord-C is nonempty, then min(Ord-C) is either 0, a successor ordinal, or a nonzero limit ordinal. Then either 0∉C (violating 1), S(α)=min(Ord-C) for some α implying α∈ while S(α)∉C (violating 2), or min(Ord-C) is a limit ordinal and all β<min(Ord-C) are elements of C by minimality, violating 3.

One main purpose of induction is defining recursive functions, namely proving that such a recursive function exist. When recursion is done on ordinals, it is often called "transfinite recursion." The main idea is that given a "class function" G that takes in functions with domain being an ordinal, there exists a unique class function F such that F(α)=G(F restricted to α) for all ordinals α. Note that "restricted to α" means a domain restricted to all elements of α, meaning all ordinals less than α.

Proof of transfinite recursion:

Ordinal arithmetic

Addition, multiplication, and exponentiation can now be defined using transfinite recursion.

← Zorn's Lemma and the Axiom of Choice · Cardinals →