Let $(\Omega, \mathcal{F}, P)$ be a probability space, and let $B \in \mathcal{F}$ be fixed, such that $P(B) > 0$. If $A \in \mathcal{F}$ is another set, then the conditional probability of $A$ given that $B$ already has occurred (or occurs with certainty) is defined as
$$P(A \mid B) := \frac{P(A \cap B)}{P(B)}.$$
Using multiplicative notation, we could have written
$$P(A \cap B) = P(A \mid B)\, P(B)$$
instead.
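To make the definition concrete, here is a minimal Python sketch on a finite probability space: a fair six-sided die with the uniform measure. The space and the events $A$ ("the roll is even") and $B$ ("the roll is at least 4") are illustrative choices made here, not taken from the text above.

```python
from fractions import Fraction

# Finite probability space: a fair six-sided die, uniform measure (illustrative).
omega = {1, 2, 3, 4, 5, 6}
P = lambda event: Fraction(len(event), len(omega))

A = {2, 4, 6}   # "the roll is even"       (hypothetical event)
B = {4, 5, 6}   # "the roll is at least 4" (hypothetical event)

# Conditional probability as defined above: P(A|B) = P(A ∩ B) / P(B).
P_A_given_B = P(A & B) / P(B)
print(P_A_given_B)                    # 2/3
# Multiplicative form: P(A ∩ B) = P(A|B) * P(B).
assert P(A & B) == P_A_given_B * P(B)
```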
This definition is intuitive, since the following lemmata are satisfied:
Lemma 3.2: $P(\Omega \mid B) = 1$.
Lemma 3.3: If $A_1, A_2, \ldots \in \mathcal{F}$ are pairwise disjoint, then $P\left(\biguplus_{n \in \mathbb{N}} A_n \,\middle|\, B\right) = \sum_{n \in \mathbb{N}} P(A_n \mid B)$.
Each lemma follows directly from the definition and the axioms holding for $P$ (definition 2.1).
From these lemmata, we obtain that for each $B \in \mathcal{F}$ with $P(B) > 0$, the set function $P(\cdot \mid B)$ satisfies the defining axioms of a probability space (definition 2.1).
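On a finite space the lemmata can also be checked exhaustively. The sketch below, under the same illustrative fair-die assumptions as before, verifies the normalisation $P(\Omega \mid B) = 1$ and additivity over every pair of disjoint events (on a finite space, finite additivity is all there is to check).

```python
from fractions import Fraction
from itertools import chain, combinations

omega = {1, 2, 3, 4, 5, 6}
P = lambda event: Fraction(len(event), len(omega))
B = {4, 5, 6}                                    # hypothetical conditioning event, P(B) > 0
P_given_B = lambda event: P(event & B) / P(B)    # the measure P(. | B)

# Lemma 3.2 (normalisation): P(Omega | B) = 1.
assert P_given_B(omega) == 1

# Lemma 3.3 (additivity): check every pair of disjoint events.
subsets = [set(s) for s in chain.from_iterable(combinations(omega, r) for r in range(7))]
for A1 in subsets:
    for A2 in subsets:
        if not (A1 & A2):
            assert P_given_B(A1 | A2) == P_given_B(A1) + P_given_B(A2)
print("P(. | B) satisfies the probability axioms on this finite space")
```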
With this definition, we have the following theorem:
Theorem 3.4 (Multiplication formula):
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)\, P(A_2 \mid A_1)\, P(A_3 \mid A_1 \cap A_2) \cdots P(A_n \mid A_1 \cap \cdots \cap A_{n-1}),$$
where $(\Omega, \mathcal{F}, P)$ is a probability space and $A_1, \ldots, A_n$ are all in $\mathcal{F}$, with $P(A_1 \cap \cdots \cap A_{n-1}) > 0$.
Proof:
From the definition, we have
$$P(A \cap B) = P(A \mid B)\, P(B)$$
for all $A, B \in \mathcal{F}$ with $P(B) > 0$. Thus, as $\mathcal{F}$ is an algebra (so that $A_1 \cap \cdots \cap A_k \in \mathcal{F}$ for each $k$), we obtain by induction:
$$P(A_1 \cap \cdots \cap A_n) = P(A_n \mid A_1 \cap \cdots \cap A_{n-1})\, P(A_1 \cap \cdots \cap A_{n-1}) = \cdots = P(A_1)\, P(A_2 \mid A_1) \cdots P(A_n \mid A_1 \cap \cdots \cap A_{n-1}).$$
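As a sanity check of the multiplication formula, here is a short sketch for $n = 3$ on the same illustrative fair-die space; the events $A_1, A_2, A_3$ are hypothetical choices with $P(A_1 \cap A_2) > 0$.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
P = lambda event: Fraction(len(event), len(omega))
cond = lambda A, B: P(A & B) / P(B)   # P(A | B)

A1 = {1, 2, 3, 4, 5}   # hypothetical events
A2 = {2, 3, 4, 5, 6}
A3 = {3, 4}

lhs = P(A1 & A2 & A3)
rhs = P(A1) * cond(A2, A1) * cond(A3, A1 & A2)
assert lhs == rhs
print(lhs)   # 1/3
```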
Theorem 3.5 (Law of total probability):
Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $\Omega = \biguplus_{n \in \mathbb{N}} B_n$ (note that by using the $\uplus$-notation, we assume that the union is disjoint), where the sets $B_n$ are all contained within $\mathcal{F}$ and satisfy $P(B_n) > 0$. Then for every $A \in \mathcal{F}$,
$$P(A) = \sum_{n \in \mathbb{N}} P(A \mid B_n)\, P(B_n).$$
Proof:
$$P(A) = P(A \cap \Omega) = P\left(A \cap \biguplus_{n \in \mathbb{N}} B_n\right) = P\left(\biguplus_{n \in \mathbb{N}} (A \cap B_n)\right) = \sum_{n \in \mathbb{N}} P(A \cap B_n) = \sum_{n \in \mathbb{N}} P(A \mid B_n)\, P(B_n),$$
where we used that the sets $A \cap B_n$ are all disjoint (so that $\sigma$-additivity applies), the distributive law of the algebra and $P(A \cap B_n) = P(A \mid B_n)\, P(B_n)$.
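A quick numerical check of the law of total probability, again under the illustrative fair-die assumptions; the disjoint decomposition of $\Omega$ into three blocks and the event $A$ are hypothetical choices.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
P = lambda event: Fraction(len(event), len(omega))
cond = lambda A, B: P(A & B) / P(B)    # P(A | B)

partition = [{1, 2}, {3, 4}, {5, 6}]   # Omega = B1 ⊎ B2 ⊎ B3 (hypothetical)
A = {2, 3, 5}                          # hypothetical event

total = sum(cond(A, Bn) * P(Bn) for Bn in partition)
assert total == P(A)                   # law of total probability
print(total)                           # 1/2
```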
Theorem 3.6 (Retarded Bayes' theorem):
Let $(\Omega, \mathcal{F}, P)$ be a probability space and $A, B \in \mathcal{F}$ with $P(A) > 0$ and $P(B) > 0$. Then
$$P(B \mid A) = \frac{P(A \mid B)\, P(B)}{P(A)}.$$
Proof:
$$P(B \mid A) = \frac{P(B \cap A)}{P(A)} = \frac{P(A \mid B)\, P(B)}{P(A)}.$$
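The theorem can likewise be checked numerically; the sketch below reuses the illustrative fair-die space and the hypothetical events from the first example.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
P = lambda event: Fraction(len(event), len(omega))
cond = lambda A, B: P(A & B) / P(B)   # P(A | B)

A = {2, 4, 6}   # hypothetical events with P(A), P(B) > 0
B = {4, 5, 6}

# Bayes' theorem: P(B | A) = P(A | B) P(B) / P(A).
assert cond(B, A) == cond(A, B) * P(B) / P(A)
print(cond(B, A))   # 2/3
```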
This formula may look somewhat abstract, but it actually has a nice geometrical meaning. Suppose we are given two sets $A, B \in \mathcal{F}$, already know $P(A \mid B)$, $P(A)$ and $P(B)$, and want to compute $P(B \mid A)$. The situation is depicted in the following picture:
We know the ratio of the size of $A \cap B$ to $B$, but what we actually want to know is how $A \cap B$ compares to $A$. Hence, we change the 'comparitant' by multiplying with $P(B)$, the old reference magnitude, and dividing by $P(A)$, the new reference magnitude.