# Probability/Print version


1. usually the first few letters in alphabetical order, e.g. $A,B$ and $C$, or $S$ (mnemonic for set)
2. usually the last few letters in alphabetical order, e.g. $X,Y$ and $Z$
3. It is not $\exp(\lambda )$.
4. For numbers, they can be just typed out.
5. This makes the symbol bigger than the inline version (`<math display=inline>...</math>`), and thus is clearer.
6. However, conventional notations should take precedence. E.g., we should use $\mu$, instead of $m$, for the mean.
7. e.g., $A\neq a$
8. Notice that we cannot regard the situation as placing three indistinguishable balls (representing the three groups) into four distinguishable cells (representing the four people) with capacity one (since every person can be assigned to only one group). With this interpretation, it is impossible to assign a group to every person (we can only assign groups to three people). Also, under this interpretation, one ball can only be put in one cell, which would mean one group can only contain one person, but this is not the case.
9. One may prove this by contradiction: assume that $\mathbb {P} (\varnothing )\neq 0$ . Then, by the nonnegativity of probability, $\mathbb {P} (\varnothing )>0$ , and so $\sum _{i=1}^{\infty }\mathbb {P} (\varnothing )=\lim _{n\to \infty }(n\underbrace {\mathbb {P} (\varnothing )} _{>0})=\infty$ (that is, the sum diverges). However, by countable additivity, $\sum _{i=1}^{\infty }\mathbb {P} (\varnothing )=\mathbb {P} (\varnothing \cup \varnothing \cup \dotsb )=\mathbb {P} (\varnothing )\leq 1$ , a contradiction.
10. This is true in general for selections of distinguishable objects, because every unordered selection corresponds to a fixed number of ordered selections. However, it is not true if the objects are indistinguishable, since then different unordered selections may correspond to different numbers of ordered selections. This is the same reason why we cannot use the division counting principle to derive the formula for placing $r$ indistinguishable balls into $n$ distinguishable cells with unlimited capacity from the formula for $r$ distinguishable balls.
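The fixed correspondence described in note 10 (each unordered selection of $r$ distinguishable objects corresponds to exactly $r!$ ordered selections) can be checked directly by enumeration; a minimal sketch, with $n$ and $r$ as illustrative values:

```python
from itertools import combinations, permutations
from math import comb, perm

n, r = 5, 3

# Every unordered selection of r distinguishable objects corresponds to
# exactly r! ordered selections, so C(n, r) = P(n, r) / r! (the division
# counting principle).
ordered = len(list(permutations(range(n), r)))    # P(5, 3) = 5 * 4 * 3 = 60
unordered = len(list(combinations(range(n), r)))  # C(5, 3) = 10

assert ordered == perm(n, r) == 60
assert unordered == comb(n, r) == 10
assert ordered == unordered * 6  # each unordered selection <-> 3! = 6 ordered ones
```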
11. Graphically,

    *-----------*
    |###########|
    *--------*##|
    |////////|##|
    *----*///|##|  ...
    |....|///|##|
    |....|///|##|
    *----*---*--*
     E_1  E_2 E_3 ...

    where the region marked '..' is F_1, '//' is F_2, and '##' is F_3.

12. Alternatively, we can define the events as $\{i{\text{th Bernoulli trial is a failure}}\}$.
13. 'indpt.' stands for independence.
14. This is because there is an unordered selection of ${\color {darkgreen}r}$ trials for 'success' without replacement from the ${\color {blue}n}$ (distinguishable and ordered) trials (the remaining positions are then for 'failure').
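Note 14's count can be verified by brute force: among the $2^n$ ordered sequences of $n$ Bernoulli trials, exactly $\binom{n}{r}$ have $r$ successes, one for each unordered choice of the $r$ success positions. A small sketch, where $n$ is an illustrative value:

```python
from itertools import product
from math import comb

n = 4

# Count, for each r, the ordered length-n 'S'/'F' sequences with exactly r
# successes; this equals C(n, r), the number of unordered choices of which
# r of the n (distinguishable, ordered) trial positions are successes.
counts = {r: 0 for r in range(n + 1)}
for seq in product('SF', repeat=n):
    counts[seq.count('S')] += 1

for r in range(n + 1):
    assert counts[r] == comb(n, r)
```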
15. Occurrence of the rare event is viewed as 'success' and non-occurrence of the rare event is viewed as 'failure'.
16. Unlike the outcomes for the binomial distribution, there is only one possible sequence for each ${\color {red}x}$ .
17. There is an unordered selection of ${\color {red}x}$ trials for 'failure' (or equivalently ${\color {darkgreen}k}-1$ trials for 'success') from the ${\color {red}x}+{\color {darkgreen}k}-1$ trials without replacement.
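Note 17's count can likewise be checked by enumeration: a negative binomial outcome with ${\color{red}x}$ failures and ${\color{darkgreen}k}$ successes is a trial sequence whose last trial is a success, so the ${\color{red}x}$ failure positions are an unordered choice out of the first ${\color{red}x}+{\color{darkgreen}k}-1$ positions. A sketch with illustrative values:

```python
from itertools import product
from math import comb

k = 3  # number of successes (illustrative value)

# A valid sequence has x failures, k successes, and ends in a success, so
# the x failure positions are an unordered choice out of the first
# x + k - 1 positions: C(x + k - 1, x) sequences in total.
for x in range(6):
    n_seq = sum(
        1
        for seq in product('SF', repeat=x + k)
        if seq[-1] == 'S' and seq.count('F') == x
    )
    assert n_seq == comb(x + k - 1, x)
```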
18. The restriction on $k$ is imposed so that the binomial coefficients are defined, i.e. the expression 'makes sense'. In practice, we rarely use this condition directly. Instead, we usually directly determine whether a specific value of $x$ 'makes sense'.
19. It is out of scope for this book.
20. The probability is 'distributed uniformly over an interval'.
21. A random variable following the Cauchy distribution has a relatively high probability of taking extreme values, compared with light-tailed distributions (e.g. the normal distribution). Graphically, the 'tails' (i.e. the left and right ends) of its pdf decay to zero more slowly.
22. The case for $a<0$ holds similarly (the inequality sign is in the opposite direction, and eventually the two negative signs cancel each other). Also, when $a=0$ , the r.v. becomes a non-random constant, so we are not interested in this case.
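The $a<0$ case mentioned in note 22 can be written out explicitly; a sketch for $Y=aX+b$ with $X$ a continuous r.v.:

```latex
% For a < 0, dividing by a flips the inequality:
F_Y(y) = \mathbb{P}(aX + b \le y)
       = \mathbb{P}\left(X \ge \frac{y - b}{a}\right)
       = 1 - F_X\left(\frac{y - b}{a}\right).
% Differentiating with respect to y gives the second negative sign,
% and the two cancel since -1/a = 1/|a| > 0:
f_Y(y) = -\frac{1}{a}\, f_X\left(\frac{y - b}{a}\right)
       = \frac{1}{|a|}\, f_X\left(\frac{y - b}{a}\right).
```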
23. Then, $p_{1}+p_{2}+\dotsb +p_{k}=1$ .
24. If the object is allocated to a cell other than the $i$ th cell, then it is a 'failure'.
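Note 24's reduction says that, focusing on the $i$ th cell alone, each allocation is a Bernoulli trial, so the marginal count for that cell is binomial. This can be checked exactly by enumerating all allocations; a sketch, where $n$ , the cell probabilities, and $i$ are illustrative choices:

```python
from itertools import product
from math import comb

n = 3                    # number of objects (illustrative)
probs = [0.2, 0.5, 0.3]  # cell probabilities, summing to 1 (illustrative)
i = 1                    # cell of interest (illustrative)

# Enumerate every allocation of the n objects to cells and accumulate the
# probability of each count for cell i; the result should match
# Binomial(n, probs[i]).
marginal = {x: 0.0 for x in range(n + 1)}
for alloc in product(range(len(probs)), repeat=n):
    prob = 1.0
    for cell in alloc:
        prob *= probs[cell]
    marginal[sum(1 for cell in alloc if cell == i)] += prob

p = probs[i]
for x in range(n + 1):
    assert abs(marginal[x] - comb(n, x) * p**x * (1 - p) ** (n - x)) < 1e-12
```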
25. The subscript $k$ for ${\mathcal {N}}$ is to emphasize that the distribution is $k$ -dimensional, and is optional.
26. Each of the Bernoulli r.v.'s acts as an indicator for the success of the corresponding trial. Since there are $n$ independent Bernoulli trials, there are $n$ such indicators.
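The decomposition in note 26 can be verified exactly for small $n$ : summing the joint probabilities of the $n$ independent indicators reproduces the binomial pmf. A sketch, where $n$ and $p$ are illustrative values:

```python
from itertools import product
from math import comb

n, p = 3, 0.4  # illustrative values

# Enumerate the joint outcomes of the indicators I_1, ..., I_n and
# accumulate the probability of each value of X = I_1 + ... + I_n.
pmf = {x: 0.0 for x in range(n + 1)}
for indicators in product([0, 1], repeat=n):
    prob = 1.0
    for ind in indicators:
        prob *= p if ind == 1 else 1 - p  # independence of the trials
    pmf[sum(indicators)] += prob

# The resulting pmf is Binomial(n, p).
for x in range(n + 1):
    assert abs(pmf[x] - comb(n, x) * p**x * (1 - p) ** (n - x)) < 1e-12
```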
27. Each geometric r.v. counts the number of failures before the corresponding success.
28. Since this probability is unconditional, the corresponding mean is also unconditional, so their sum is also an unconditional mean (as in the proposition).
29. $X_{1},\dotsc ,X_{n}$ are dependent, but we can still use the linearity of expectation, since it does not require independence.
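To illustrate note 29's point that linearity of expectation needs no independence, here is a standard example (assumed here for illustration, not taken from the text) with dependent indicators: the fixed points of a random permutation. Each indicator has mean $1/n$ , so the expected number of fixed points is exactly $1$ , even though the indicators are dependent:

```python
from itertools import permutations

n = 5  # illustrative value

# X_i = 1{position i is a fixed point}; these indicators are dependent,
# but E[X_1 + ... + X_n] = n * (1/n) = 1 by linearity of expectation.
perms = list(permutations(range(n)))
total_fixed_points = sum(
    sum(1 for i in range(n) if sigma[i] == i) for sigma in perms
)
expected = total_fixed_points / len(perms)

assert expected == 1.0
```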
30. Each geometric r.v. counts the number of failures before the corresponding success.
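The decomposition behind notes 27 and 30 (a negative binomial count of failures as a sum of $k$ geometric failure counts) can be checked numerically by convolving the geometric pmf $k$ times; a sketch, where $k$ , $p$ , and the truncation point are illustrative choices:

```python
from math import comb

k, p = 3, 0.4  # illustrative values
N = 40         # truncation point; pmf values below N are computed exactly

geom = [p * (1 - p) ** x for x in range(N)]  # P(X_i = x): x failures, then a success
dist = [1.0] + [0.0] * (N - 1)               # pmf of the empty sum (point mass at 0)
for _ in range(k):                           # convolve in the geometric pmf k times
    dist = [sum(dist[j] * geom[x - j] for j in range(x + 1)) for x in range(N)]

# Compare with the negative binomial pmf C(x + k - 1, x) p^k (1 - p)^x.
for x in range(15):
    assert abs(dist[x] - comb(x + k - 1, x) * p**k * (1 - p) ** x) < 1e-9
```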
31. or equivalently, a transformation between the supports of $\mathbf {X}$ and $\mathbf {Y}$