# Probability/Random Variables

## Definition

Definition. (Random variable) Formally, a random variable on a probability space ${\displaystyle (\Omega ,{\mathcal {F}},\mathbb {P} )}$ is a measurable real function ${\displaystyle X:\Omega \to \mathbb {R} }$ defined on ${\displaystyle \Omega }$ (the set of possible outcomes).

Remark.

• The property of measurability means that for each real ${\displaystyle x}$ , the set
${\displaystyle \{X\leq x\}{\overset {\text{ def }}{=}}\{\omega \in \Omega :X(\omega )\leq x\}\in {\mathcal {F}}}$ , i.e. is an event in the probability space.
• Measurability will not be emphasized in this book.
• In some definitions, the codomain of the random variable is defined as ${\displaystyle [-\infty ,\infty ]}$ , namely the extended real number line.
• Usually, a capital letter is used to represent a random variable, and the small corresponding letter is used to represent a value taken by the random variable, e.g. ${\displaystyle x}$  is a value taken by random variable ${\displaystyle X}$ .

Since a random variable maps each outcome in ${\displaystyle \Omega }$ to a real number, it quantifies the outcomes in ${\displaystyle \Omega }$, which makes them easier to work with. A closely related function is the indicator function, which is useful in many situations.

Definition. (Indicator function) For each statement (which is usually an event) ${\displaystyle E}$ , the indicator function is

${\displaystyle \mathbf {1} \{E\}={\begin{cases}1,\quad &E{\text{ is true}};\\0,\quad &E{\text{ is false}}.\\\end{cases}}}$

Example. Let ${\displaystyle X}$  be the number of heads facing up from tossing an unfair coin one time. Then, ${\displaystyle X}$  is a random variable, since

${\displaystyle X({\text{head comes up}})=1,\quad X({\text{tail comes up}})=0.}$

Also, if we let ${\displaystyle Y}$  be the number of times the coin is tossed, then ${\displaystyle Y}$  is still a random variable, since ${\displaystyle \Omega }$  is ${\displaystyle \{{\text{coin is tossed once}}\}}$ , and ${\displaystyle Y({\text{coin is tossed once}})=1}$ , even if ${\displaystyle \Omega }$  only contains one element.

Remark. Actually, ${\displaystyle X=\mathbf {1} \{{\text{head comes up}}\}}$ .

Proposition. (Properties of indicator function) For each event ${\displaystyle E}$ ,

• (Complementary event) ${\displaystyle \mathbf {1} \{E^{c}\}=1-\mathbf {1} \{E\}}$

For events ${\displaystyle E_{1},E_{2},\ldots }$,

• (Intersection of events) ${\displaystyle \mathbf {1} \{E_{1}\cap E_{2}\cap \cdots \}=\mathbf {1} \{E_{1}\}\mathbf {1} \{E_{2}\}\cdots }$
• (Union of events) ${\displaystyle \mathbf {1} \{E_{1}\cup E_{2}\cup \cdots \}=\mathbf {1} \{E_{1}\}+\mathbf {1} \{E_{2}\}+\cdots }$  for mutually exclusive events ${\displaystyle E_{1},E_{2},\ldots }$

Proof. Outline:

Complementary event: ${\displaystyle E}$ is true ${\displaystyle \Leftrightarrow E^{c}}$ is false, and ${\displaystyle E^{c}}$ is true ${\displaystyle \Leftrightarrow E}$ is false.

Intersection of events: if all of the events ${\displaystyle E_{1},E_{2},\ldots }$ are true, then ${\displaystyle E_{1}\cap E_{2}\cap \cdots }$ is true and the product on the right-hand side is one. If at least one of the events is false, then ${\displaystyle E_{1}\cap E_{2}\cap \cdots }$ is false, and the product on the right-hand side is zero as well.

Union of events: since the events are mutually exclusive, at most one of the events is true, so the sum on the right-hand side cannot be larger than 1. If one of the events ${\displaystyle E_{1},E_{2},\ldots }$ is true, then the union ${\displaystyle E_{1}\cup E_{2}\cup \cdots }$ is also true, and the sum on the right-hand side is one as well. If none of the events is true, then the union is false and the sum is zero.

${\displaystyle \Box }$
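The three identities can also be checked outcome by outcome on a small sample space. The following is a minimal sketch, assuming a single throw of a six-faced dice; the events `E1`, `E2`, `E3` and the helper names are illustrative choices, not part of the text above.

```python
def indicator(statement: bool) -> int:
    """Return 1 if the statement is true and 0 if it is false."""
    return 1 if statement else 0


# Sample space for one throw of a six-faced dice (an illustrative assumption).
OMEGA = range(1, 7)

# Events as sets of outcomes. E1 and E3 are mutually exclusive; E1 and E2 are not.
E1 = {1, 2}
E2 = {2, 4, 6}
E3 = {5, 6}


def identities_hold() -> bool:
    """Check the complement, intersection and (disjoint) union identities."""
    for w in OMEGA:
        complement_ok = indicator(w not in E1) == 1 - indicator(w in E1)
        intersection_ok = (
            indicator(w in (E1 & E2)) == indicator(w in E1) * indicator(w in E2)
        )
        union_ok = (
            indicator(w in (E1 | E3)) == indicator(w in E1) + indicator(w in E3)
        )
        if not (complement_ok and intersection_ok and union_ok):
            return False
    return True


print(identities_hold())

# Without mutual exclusivity the union identity can fail:
# outcome 2 lies in both E1 and E2, so the sum on the right is 2, not 1.
print(indicator(2 in (E1 | E2)), indicator(2 in E1) + indicator(2 in E2))
```

The last line illustrates why the union identity is stated only for mutually exclusive events.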

Exercise.

A fair six-faced dice is thrown one time. Let ${\displaystyle X}$ be the number facing up. Which of the following is (are) true?

• ${\displaystyle X=3.5}$
• ${\displaystyle X=\mathbf {1} \{{\text{the number facing up}}\}}$
• ${\displaystyle X={\text{the number facing up}}}$
• ${\displaystyle X=1,2,3,4,5\;{\text{or}}\;6}$
• ${\displaystyle \mathbf {1} \{X=0\}=0}$
• ${\displaystyle \mathbf {1} \{X=1\}=1}$
• ${\displaystyle \mathbf {1} \{X\;{\text{is a nonnegative number}}\}=1}$

## Cumulative distribution function

Figure: Three examples of cdfs, which are illustrated by the red lines and dots between two blue lines.

Definition. (Cumulative distribution function) The cumulative distribution function (cdf) of a random variable ${\displaystyle X}$ is

${\displaystyle F(x)=\mathbb {P} (X\leq x)=\mathbb {P} (\{\omega \in \Omega :X(\omega )\leq x\})}$

in which ${\displaystyle x\in \mathbb {R} }$ .

Remark.

• The cdf completely determines the random behaviour (i.e. the distribution) of a random variable.

Example. Suppose we toss a fair coin two times. Then the sample space ${\displaystyle \Omega }$ is ${\displaystyle \{HH,HT,TH,TT\}}$, in which ${\displaystyle HT}$ means that head and tail come up in the first and second toss respectively; the other notations are defined similarly. Define ${\displaystyle X}$ to be the number of heads, and

${\displaystyle Y={\begin{cases}1,\quad &{\text{head comes up in both tosses}};\\-1,\quad &{\text{tail comes up in both tosses}};\\0,\quad &{\text{otherwise}}.\end{cases}}}$

Show that the cdfs of ${\displaystyle X}$ and ${\displaystyle Y}$ are
${\displaystyle F_{X}(t)={\begin{cases}0,\quad &t<0;\\1/4,\quad &0\leq t<1;\\3/4,\quad &1\leq t<2;\\1,\quad &t\geq 2,\end{cases}}\quad {\text{and}}\quad F_{Y}(t)={\begin{cases}0,\quad &t<-1;\\1/4,\quad &-1\leq t<0;\\3/4,\quad &0\leq t<1;\\1,\quad &t\geq 1.\end{cases}}}$

Proof. For the cdf of ${\displaystyle X}$, first, ${\displaystyle \mathbb {P} (X=0)=\mathbb {P} (\{TT\})=1/4,\mathbb {P} (X=1)=\mathbb {P} (\{HT,TH\})=1/2,\mathbb {P} (X=2)=\mathbb {P} (\{HH\})=1/4}$, ${\displaystyle \mathbb {P} (X=x)=0}$ for each ${\displaystyle x\in \mathbb {R} \setminus \{0,1,2\}}$ and ${\displaystyle \Omega =\{X=0\}\cup \{X=1\}\cup \{X=2\}}$.

If ${\displaystyle t<0}$, ${\displaystyle \mathbb {P} (X\leq t)=0}$ since ${\displaystyle \{X\leq t\}=\varnothing }$.

If ${\displaystyle 0\leq t<1}$, ${\displaystyle \mathbb {P} (X\leq t)=1/4}$ since ${\displaystyle \{X\leq t\}=\{X=0\}}$.

If ${\displaystyle 1\leq t<2}$, ${\displaystyle \mathbb {P} (X\leq t)=1/4+1/2=3/4}$ since ${\displaystyle \{X\leq t\}=\{X=0\}\cup \{X=1\}}$.

If ${\displaystyle t\geq 2}$, ${\displaystyle \mathbb {P} (X\leq t)=1/4+1/2+1/4=1}$ since ${\displaystyle \{X\leq t\}=\{X=0\}\cup \{X=1\}\cup \{X=2\}=\Omega }$.

Similarly, we can get the desired cdf of ${\displaystyle Y}$ , by considering ${\displaystyle Y\leq t}$  for ${\displaystyle t}$  in different ranges.

${\displaystyle \Box }$

Remark. Graphically, the cdfs of ${\displaystyle X}$ and ${\displaystyle Y}$ are step functions.
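The cdf values in this example can be reproduced by enumerating the sample space. The sketch below assumes that each of the four outcomes has probability 1/4 (i.e. the coin is fair, as the probabilities in the proof suggest); the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space of two tosses: HH, HT, TH, TT.
OMEGA = ["".join(t) for t in product("HT", repeat=2)]

# Assumption: each outcome is equally likely (fair coin).
PROB = {w: Fraction(1, 4) for w in OMEGA}


def X(w: str) -> int:
    """Number of heads in the outcome w."""
    return w.count("H")


def Y(w: str) -> int:
    """1 if both tosses are heads, -1 if both are tails, 0 otherwise."""
    if w == "HH":
        return 1
    if w == "TT":
        return -1
    return 0


def cdf(rv, t) -> Fraction:
    """F(t) = P(rv <= t), computed by summing over the outcomes."""
    return sum((p for w, p in PROB.items() if rv(w) <= t), Fraction(0))


for t in (-0.5, 0, 0.5, 1, 1.5, 2):
    print(t, cdf(X, t), cdf(Y, t))
```

Evaluating the two cdfs at points on both sides of each jump shows the step-function shape directly.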

Exercise.

Suppose cdf of a random variable ${\displaystyle X}$  is ${\displaystyle F(x)}$ .

1 Given that ${\displaystyle F(2)=0.1}$ , compute ${\displaystyle \mathbb {P} (X>2)}$ .

 0 0.1 0.9 1

2 In addition to ${\displaystyle F(2)=0.1}$, it is further given that ${\displaystyle F(3)=0.1}$. Compute ${\displaystyle \mathbb {P} (2<X\leq 3)}$.

 0 0.01 0.1 0.2 0.8

3 Which of the following is (are) possible?

• ${\displaystyle F(4)=0.05}$
• ${\displaystyle F(3.5)=1}$
• ${\displaystyle F(1)=0}$
• ${\displaystyle F(1)=2}$
• ${\displaystyle \mathbb {P} (X=2)=0.1}$

In the following, we will discuss three defining properties of cdf.

Theorem. (Defining properties of cdf) A function ${\displaystyle F}$  is the cdf of a random variable ${\displaystyle X}$  if and only if

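(i) ${\displaystyle 0\leq F(x)\leq 1}$ for each real number ${\displaystyle x}$, with ${\displaystyle \lim _{x\to -\infty }F(x)=0}$ and ${\displaystyle \lim _{x\to \infty }F(x)=1}$.

(ii) ${\displaystyle F}$ is nondecreasing.

(iii) ${\displaystyle F}$ is right-continuous.

Proof. Only if part (${\displaystyle F}$ is cdf ${\displaystyle \Rightarrow }$ these three properties):

(i) The bounds follow from the axioms of probability, since ${\displaystyle F(x)}$ is defined to be a probability. The limits follow from the continuity of probability: as ${\displaystyle n\to \infty }$, the events ${\displaystyle \{X\leq -n\}}$ decrease to ${\displaystyle \varnothing }$ and the events ${\displaystyle \{X\leq n\}}$ increase to ${\displaystyle \Omega }$.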

(ii)

${\displaystyle {\begin{aligned}x\leq y&\Rightarrow \{X\leq x\}\subseteq \{X\leq y\}\\&\Rightarrow \mathbb {P} (X\leq x)\leq \mathbb {P} (X\leq y)&\qquad {\text{by monotonicity}}\\&\Rightarrow F(x)\leq F(y)&\qquad {\text{by definition}}\\\end{aligned}}}$

(iii) Fix an arbitrary positive sequence ${\displaystyle \epsilon _{1}>\epsilon _{2}>\cdots }$ with ${\displaystyle \lim _{n\to \infty }\epsilon _{n}=0}$. Define ${\displaystyle E_{n}=\{X\leq x+\epsilon _{n}\}}$ for each positive integer ${\displaystyle n}$. It follows that ${\displaystyle E_{1}\supset E_{2}\supset \cdots }$. Then, by the continuity of probability for decreasing events,

${\displaystyle \mathbb {P} (X\leq x)=\mathbb {P} \underbrace {\left(\lim _{n\to \infty }E_{n}\right)} _{\{X\leq x+0\}}=\mathbb {P} \left(\lim _{n\to \infty }E_{1}\cap E_{2}\cap \cdots \cap E_{n}\right)=\lim _{n\to \infty }\mathbb {P} (E_{1}\cap \cdots \cap E_{n})=\lim _{n\to \infty }\mathbb {P} (E_{n})=\lim _{n\to \infty }\mathbb {P} (X\leq x+\epsilon _{n})}$

It follows that
${\displaystyle F(x)=\lim _{n\to \infty }F(x+\epsilon _{n})}$

for each ${\displaystyle \epsilon _{1}>\epsilon _{2}>\cdots }$  with ${\displaystyle \epsilon _{n}\to 0}$  as ${\displaystyle n\to \infty }$ . That is,
${\displaystyle \lim _{h\to 0^{+}}F(x+h)=F(x)}$

which is the definition of right-continuity.

The if part is more complicated, and the following outline is optional:

1. Draw an arbitrary curve satisfying the three properties.
2. Throw a fair coin infinitely many times.
3. Encode each result into a binary number, e.g. ${\displaystyle HHT\cdots \to 0.110\ldots }$
4. Transform each binary number to a decimal number, e.g. ${\displaystyle 0.110\ldots \to 1(2^{-1})+1(2^{-2})=0.75\ldots }$ . Then, the decimal number is a random variable ${\displaystyle U\in [0,1]}$ .
5. Use this decimal number as the input of the inverse function of the arbitrarily drawn curve, and we get a value, which is also a random variable, say ${\displaystyle X}$ .
6. Then ${\displaystyle F}$ is the cdf of the random variable ${\displaystyle X}$: since ${\displaystyle U}$ is uniformly distributed on ${\displaystyle [0,1]}$ when the fair coin is thrown infinitely many times, ${\displaystyle \mathbb {P} (X\leq x)=\mathbb {P} (U\leq F(x))=F(x)}$.

${\displaystyle \Box }$
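The outline of the if part can be imitated numerically. The sketch below is only an illustration under stated assumptions: the coin is thrown 32 times instead of infinitely many times, the "arbitrary curve" is taken to be ${\displaystyle F(x)=1-e^{-x}}$ for ${\displaystyle x\geq 0}$ (chosen because its inverse has a closed form), and all function names are hypothetical.

```python
import math
import random


def bits_to_number(bits) -> float:
    """Read a toss sequence (1 = head, 0 = tail) as the binary number 0.b1 b2 b3 ..."""
    return sum(bit * 2.0 ** (-k) for k, bit in enumerate(bits, start=1))


def target_cdf(x: float) -> float:
    """The curve chosen for this sketch: F(x) = 1 - exp(-x) for x >= 0."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0


def inverse_cdf(u: float) -> float:
    """Inverse of the chosen curve on [0, 1)."""
    return -math.log(1.0 - u)


def sample(rng: random.Random, n_tosses: int = 32) -> float:
    """Toss a fair coin n_tosses times, form U, and return X = F^{-1}(U)."""
    tosses = [rng.getrandbits(1) for _ in range(n_tosses)]
    return inverse_cdf(bits_to_number(tosses))


def empirical_cdf(samples, x: float) -> float:
    """Fraction of the samples that are <= x."""
    return sum(s <= x for s in samples) / len(samples)


rng = random.Random(0)  # fixed seed so that the run is reproducible
samples = [sample(rng) for _ in range(20000)]

print(bits_to_number([1, 1, 0]))  # the HHT example from step 4
for x in (0.5, 1.0, 2.0):
    print(x, round(empirical_cdf(samples, x), 3), round(target_cdf(x), 3))
```

The empirical proportions should be close to the values of the chosen curve, which is what step 6 asserts in the limit.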

Sometimes, we are only interested in the values ${\displaystyle x}$  such that ${\displaystyle \mathbb {P} (X=x)\neq 0}$ , which are more 'important'. Roughly speaking, the values are actually the elements of the support of ${\displaystyle X}$ , which is defined in the following.

Definition. (Support of random variable) The support of a random variable ${\displaystyle X}$ , ${\displaystyle \operatorname {supp} (X)}$ , is the smallest closed set ${\displaystyle S}$  such that ${\displaystyle \mathbb {P} (X\in S)=1}$ .

Remark.

• For example, a closed interval is a closed set.
• Closedness will not be emphasized in this book.
• Practically, ${\displaystyle \operatorname {supp} (X)=\{x\in \mathbb {R} :f(x)>0\}}$  (which is the smallest closed set).
• ${\displaystyle f(x)}$  is probability mass function for discrete random variables;
• ${\displaystyle f(x)}$  is probability density function for continuous random variables.
• The terms mentioned above will be defined later.

Example. If

${\displaystyle \mathbb {P} (X=x)={\begin{cases}1/4,\quad &x=0;\\1/8,\quad &x=3;\\5/8,\quad &x=6;\\0&{\text{otherwise}},\\\end{cases}}}$

then ${\displaystyle \operatorname {supp} (X)=\{0,3,6\}}$ , since ${\displaystyle \mathbb {P} (X\in \{0,3,6\})=1}$  and this set is the smallest set among all sets satisfying this requirement.

Remark. ${\displaystyle \mathbb {R} ,\{0,1,2,3,4,5,6\},}$  etc. also satisfy the requirement, but they are not the smallest set.
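For a pmf with finitely many candidate points, the support can be read off as the set of points with positive probability. The following is a small sketch of this for the example above; the dictionary representation and the extra zero-probability points are illustrative assumptions.

```python
from fractions import Fraction

# pmf of the example, with some zero-probability points listed explicitly.
PMF = {
    0: Fraction(1, 4),
    1: Fraction(0),
    3: Fraction(1, 8),
    4: Fraction(0),
    6: Fraction(5, 8),
}


def support(pmf: dict) -> set:
    """Points carrying positive probability."""
    return {x for x, p in pmf.items() if p > 0}


def prob_in(pmf: dict, s: set) -> Fraction:
    """P(X in s) for a set s of candidate points."""
    return sum((p for x, p in pmf.items() if x in s), Fraction(0))


supp = support(PMF)
print(sorted(supp), prob_in(PMF, supp))

# A larger set also has probability one, but it is not the smallest such set.
print(prob_in(PMF, {0, 1, 2, 3, 4, 5, 6}))

# Removing any point of the support loses probability.
print(prob_in(PMF, supp - {3}))
```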

Exercise.

Suppose we throw an unfair coin. Define ${\displaystyle X=1}$  if head comes up and ${\displaystyle X=-1}$  otherwise. Let ${\displaystyle F(x)}$  be the cdf of ${\displaystyle X}$ .

1 Find ${\displaystyle \operatorname {supp} (\mathbf {1} \{X=1\})}$ .

• ${\displaystyle \{-1,1\}}$
• ${\displaystyle \{0,1\}}$
• ${\displaystyle \{\mathbf {1} \{X=1\}=0,\mathbf {1} \{X=1\}=1\}}$
• It cannot be determined since the probability that head comes up is not given.

2 Suppose ${\displaystyle \mathbb {P} (X=1)=0.7}$ , compute ${\displaystyle F(0)}$ .

 0 0.3 0.5 0.7 1

3 Suppose ${\displaystyle \mathbb {P} (X=1)=p\in (0,1)}$ . Which of the following is (are) true?

• ${\displaystyle F(1)=1}$
• ${\displaystyle F(-1)=0}$
• ${\displaystyle F(0)+F(-1)=F(1)}$
• ${\displaystyle F(1)=2F(0.5)}$ if the coin is fair instead.
• ${\displaystyle \lim _{x\to -1^{-}}F(x)=F(-1)}$

## Discrete random variables

Definition. (Discrete random variables) If ${\displaystyle \operatorname {supp} (X)}$  is countable (i.e. 'enumerable' or 'listable'), then the random variable ${\displaystyle X}$  is a discrete random variable.

Example. Let ${\displaystyle X}$  be the number of successes among ${\displaystyle n}$  Bernoulli trials. Then, ${\displaystyle X}$  is a discrete random variable, since ${\displaystyle \operatorname {supp} (X)=\{0,1,\ldots ,n\}}$  which is countable.

On the other hand, if we let ${\displaystyle Y}$  be the temperature on Celsius scale, ${\displaystyle Y}$  is not discrete, since ${\displaystyle \operatorname {supp} (Y)=[\underbrace {-273.15} _{\text{absolute zero}},\underbrace {1.417\times 10^{32}} _{\text{Planck temperature}}]}$  which is not countable.

Exercise.

Which of the following is (are) discrete random variable(s)?

• Number of heads coming up from tossing a coin three times.
• A number lying between 0 and 1 inclusively.
• Number of correct option(s) in a multiple choice question in which there are at most three correct options.
• Answer to a short question asking for a numeric answer.
• Probability for a random variable to be a discrete random variable.

Often, for a discrete random variable, we are interested in the probability that the random variable takes a specific value. So, we have a function that gives the corresponding probability for each specific value taken, namely the probability mass function.

Figure: An example of a pmf. This function is called a probability mass function, since the value at each point may be interpreted as the mass of the dot located at that point.

Definition. (Probability mass function) Let ${\displaystyle X}$ be a discrete random variable. The probability mass function (pmf) of ${\displaystyle X}$ is

${\displaystyle f({\color {green}x})=\mathbb {P} (X={\color {green}x}).}$

Remark.

• Alternative names include mass function and probability function.
• If random variable ${\displaystyle X}$  is discrete, then ${\displaystyle \operatorname {supp} (X)=\{x\in \mathbb {R} :f(x)>0\}}$  (it is closed).
• The cdf of random variable ${\displaystyle X}$  is ${\displaystyle F(x)=\mathbb {P} (X\leq x)=\sum _{\{y:y\leq x\}}f(y)}$ . It follows that the sum of the value of pmf at each ${\displaystyle x}$  inside the support equals one.
• The cdf of a discrete random variable ${\displaystyle X}$  is a step function with jumps at the points in ${\displaystyle \operatorname {supp} (X)}$ , and the size of each jump defines the pmf of ${\displaystyle X}$  at the corresponding point in ${\displaystyle \operatorname {supp} (X)}$ .

Example. Suppose we throw a fair six-faced dice one time. Let ${\displaystyle X}$  be the number facing up. Then, pmf of ${\displaystyle X}$  is

${\displaystyle f(x)={\begin{cases}1/6,\quad &x=1,2,3,4,5{\text{ or }}6;\\0&{\text{otherwise}}.\end{cases}}}$
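The relation between the pmf and the cdf described in the remark can be illustrated with this dice example. The sketch below uses exact fractions; the helper `cdf_left` (the left limit of the cdf) is an illustrative name introduced here.

```python
from fractions import Fraction

FACES = (1, 2, 3, 4, 5, 6)


def pmf(x) -> Fraction:
    """f(x) = 1/6 on the six faces and 0 otherwise."""
    return Fraction(1, 6) if x in FACES else Fraction(0)


def cdf(x) -> Fraction:
    """F(x) = sum of f(y) over the faces y <= x."""
    return sum((pmf(y) for y in FACES if y <= x), Fraction(0))


def cdf_left(x) -> Fraction:
    """Left limit of F at x: sum of f(y) over the faces y < x."""
    return sum((pmf(y) for y in FACES if y < x), Fraction(0))


# The pmf sums to one over the support.
print(sum(pmf(x) for x in FACES))

# The cdf is a step function: it is flat between faces ...
print(cdf(3), cdf(3.5), cdf(3.99))

# ... and the size of the jump at a face equals the pmf there.
print(cdf(4) - cdf_left(4), pmf(4))
```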

Exercise.

1 Which of the following is (are) pmf?

• ${\displaystyle f(x)={\begin{cases}1/2^{x},\quad &x\in \mathbb {N} \\0&{\text{otherwise}}\end{cases}}}$. It is given that ${\displaystyle \mathbb {N} =\{1,2,\ldots \}}$ is countable.
• ${\displaystyle f(x)={\begin{cases}1,\quad &0\leq x\leq 1\\0&{\text{otherwise}}\end{cases}}}$
• ${\displaystyle f(x)={\begin{cases}0.2,\quad &x=2\\0.3,\quad &x=6\\0.4,\quad &x=8\\0&{\text{otherwise}}\end{cases}}}$
• ${\displaystyle f(x)={\begin{cases}0.2,\quad &x=2\\0.3,\quad &x=6\\0.4,\quad &x=8\\0.1&{\text{otherwise}}\end{cases}}}$
• ${\displaystyle f(x)={\frac {\mathbf {1} \{x=2\cup x=3\cup x=4\}}{3}}}$

2 Compute ${\displaystyle k}$  such that the function ${\displaystyle f(x)=\mathbf {1} \{x=k\}k+2(\mathbf {1} \{x=2k\}k)+3(\mathbf {1} \{x=3k\}k)}$  is a pmf.

 ${\displaystyle 1/12}$ ${\displaystyle 1/6}$ ${\displaystyle 1/3}$ ${\displaystyle 1}$

## Continuous random variables

Suppose ${\displaystyle X}$ is a random variable and ${\displaystyle S}$ is a set of real numbers. Partitioning ${\displaystyle S}$ into small disjoint intervals ${\displaystyle [x_{1},x_{1}+\Delta x_{1}],\dotsc }$ gives

${\displaystyle \mathbb {P} (X\in S)=\mathbb {P} \left(X\in \bigcup _{i}[x_{i},x_{i}+\Delta x_{i}]\right)=\sum _{i}\mathbb {P} {\big (}X\in [x_{i},x_{i}+\Delta x_{i}]{\big )}=\sum _{i}\underbrace {\frac {\mathbb {P} {\big (}X\in [x_{i},x_{i}+\Delta x_{i}]{\big )}}{\Delta x_{i}}} _{\text{probability per unit}}\cdot \Delta x_{i}.}$

In particular, the probability per unit can be interpreted as the density of the probability of ${\displaystyle X}$ over the interval. (The higher the density, the more probability is distributed (or allocated) to that interval).

Taking the limit,

${\displaystyle \lim _{\Delta x_{i}\to 0}\sum _{i}\underbrace {\frac {\mathbb {P} {\big (}X\in [x_{i},x_{i}+\Delta x_{i}]{\big )}}{\Delta x_{i}}} _{\text{density}}\cdot \Delta x_{i}=\int _{S}\underbrace {f(x)} _{\text{density}}\,dx,}$

in which, intuitively and non-rigorously, ${\displaystyle f(x)\,dx}$ can be interpreted as the probability over the 'infinitesimal' interval ${\displaystyle [x,x+dx]}$, i.e. ${\displaystyle \mathbb {P} (X\in [x,x+dx])}$, and ${\displaystyle f(x)}$ can be interpreted as the density of the probability over the 'infinitesimal' interval, i.e. ${\displaystyle {\frac {\mathbb {P} (X\in [x,x+dx])}{dx}}}$.

These motivate us to have the following definition.

Definition. (Continuous random variable) A random variable ${\displaystyle X}$  is continuous if

${\displaystyle \mathbb {P} (X\in S)=\int _{S}f(x)\,dx}$

for each (measurable) set ${\displaystyle S\subseteq \mathbb {R} }$  and for some nonnegative function ${\displaystyle f}$ .

Remark.

• The function ${\displaystyle f}$  is called probability density function (pdf), density function, or probability function (rarely).
• If ${\displaystyle X}$ is continuous, then the probability that ${\displaystyle X}$ takes any single value is zero, i.e. ${\displaystyle \mathbb {P} (X=x)=0}$ for each real number ${\displaystyle x}$.
• This can be seen by setting ${\displaystyle S=\{x\}}$ , then ${\displaystyle \int _{S}f(u)\,du=\int _{x}^{x}f(u)\,du=0}$  (dummy variable is changed).
• By setting ${\displaystyle S=(-\infty ,x]}$ , the cdf ${\displaystyle F(x)=\mathbb {P} {\big (}X\in (-\infty ,x]{\big )}=\int _{-\infty }^{x}f(u)\,du}$ .
• Measurability will not be emphasized. The sets encountered in this book are all measurable.
• ${\displaystyle \int _{S}f(x)\,dx}$ is the area under the pdf over ${\displaystyle S}$, which represents the probability ${\displaystyle \mathbb {P} (X\in S)}$ (obtained by integrating the density function over the set ${\displaystyle S}$).

The name continuous r.v. comes from the result that the cdf of this kind of r.v. is continuous.

Proposition. (Continuity of cdf of continuous random variable) If a random variable ${\displaystyle X}$  is continuous, its cdf ${\displaystyle F}$  is also continuous (not just right-continuous).

Proof. Since ${\displaystyle \lim _{h\to 0}F(x+h)=\lim _{h\to 0}\int _{-\infty }^{x+h}f(u)\,du=\int _{-\infty }^{x}f(u)\,du=F(x)}$ (the integral is a continuous function of its upper limit), the cdf is continuous.

${\displaystyle \Box }$

Example. (Exponential distribution) For ${\displaystyle \lambda >0}$, the function ${\displaystyle F(x)=(1-e^{-\lambda x})\mathbf {1} \{x\geq 0\}}$ is a cdf of a continuous random variable since

• It is nonnegative.
• Its derivative for ${\displaystyle x>0}$ is ${\displaystyle f(x)=\lambda e^{-\lambda x}}$, and ${\displaystyle \int _{-\infty }^{\infty }\lambda e^{-\lambda x}\mathbf {1} \{x\geq 0\}\,dx=\int _{0}^{\infty }\lambda e^{-\lambda x}\,dx=\left[-e^{-\lambda x}\right]_{0}^{\infty }=0-(-\underbrace {e^{0}} _{1})=1}$. So, ${\displaystyle \lim _{x\to \infty }F(x)=1}$.
• It is nondecreasing.
• It is right-continuous (and also continuous).
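These properties can be checked numerically. The sketch below assumes ${\displaystyle \lambda =2}$ and the density ${\displaystyle \lambda e^{-\lambda x}\mathbf {1} \{x\geq 0\}}$ (the derivative of ${\displaystyle F}$ for ${\displaystyle x>0}$), and uses a simple midpoint rule; the value of ${\displaystyle \lambda }$, the truncation of the integral at 50 and the function names are illustrative assumptions.

```python
import math

LAMBDA = 2.0  # assumed rate parameter, any positive value would do


def pdf(x: float) -> float:
    """Density lambda * exp(-lambda x) on x >= 0, zero elsewhere."""
    return LAMBDA * math.exp(-LAMBDA * x) if x >= 0 else 0.0


def cdf(x: float) -> float:
    """F(x) = (1 - exp(-lambda x)) 1{x >= 0}."""
    return 1.0 - math.exp(-LAMBDA * x) if x >= 0 else 0.0


def integrate(f, a: float, b: float, n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Total probability: the density integrates to (approximately) one.
print(integrate(pdf, 0.0, 50.0))

# The cdf is the integral of the density up to x.
print(integrate(pdf, 0.0, 1.5), cdf(1.5))

# The cdf is nondecreasing on a grid of points.
grid = [i / 10 for i in range(-20, 51)]
print(all(cdf(a) <= cdf(b) for a, b in zip(grid, grid[1:])))
```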

Exercise.

1 Which of the following is (are) pdf?

• ${\displaystyle f(x)=\mathbf {1} \{x\geq 0\}/x}$
• ${\displaystyle f(x)=\mathbf {1} \{x\geq 0\}/x^{2}}$
• ${\displaystyle f(x)=\mathbf {1} \{3\leq x\leq 8\}/5}$
• ${\displaystyle f(x)=\mathbf {1} \{0\leq x\leq 1\}x}$
• ${\displaystyle f(x)=\mathbf {1} \{0\leq x\leq {\sqrt {2}}\}({\sqrt {2}}-x)}$

2 Compute ${\displaystyle k}$  such that the function ${\displaystyle f(x)=k\mathbf {1} \{0\leq x\leq k/4\}x}$  is a pdf.

 ${\displaystyle 1}$ ${\displaystyle 2^{1/3}}$ ${\displaystyle {\sqrt {2}}}$ ${\displaystyle 2}$ There does not exist such ${\displaystyle k}$ .

3 Compute ${\displaystyle k}$  such that the function ${\displaystyle F(x)=k\mathbf {1} \{0\leq x\leq k/4\}x}$  is a cdf.

 ${\displaystyle 1}$ ${\displaystyle 2^{1/3}}$ ${\displaystyle {\sqrt {2}}}$ ${\displaystyle 2}$ There does not exist such ${\displaystyle k}$ .

4 Which of the following is (are) true?

• If the support of a random variable is countable, then it is discrete.
• If the support of a random variable is not countable, then it is continuous.
• If the support of a random variable is not countable, then it is not discrete.

Proposition. (Finding pdf using cdf) If cdf ${\displaystyle F(x)}$  of a continuous random variable is differentiable, then the pdf ${\displaystyle f(x)=F'(x)}$ .

Proof. This follows from fundamental theorem of calculus:

${\displaystyle F'(x)={\frac {d}{dx}}\int _{-\infty }^{x}f(u)\,du=f(x).}$

${\displaystyle \Box }$

Remark. Since ${\displaystyle F(x)}$  is nondecreasing, ${\displaystyle F'(x)\geq 0\Rightarrow f(x)\geq 0}$ . This shows that ${\displaystyle f(x)}$  is always nonnegative if ${\displaystyle F}$  is differentiable. It is a motivation for us to define pdf to be nonnegative.
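The relation ${\displaystyle f(x)=F'(x)}$ can be illustrated with a finite-difference approximation of the derivative. The sketch below assumes the exponential cdf ${\displaystyle F(x)=1-e^{-\lambda x}}$ for ${\displaystyle x\geq 0}$ with ${\displaystyle \lambda =2}$ and compares a central difference with the density ${\displaystyle \lambda e^{-\lambda x}}$; the step size and function names are illustrative choices.

```python
import math

LAMBDA = 2.0  # assumed rate parameter


def cdf(x: float) -> float:
    """F(x) = 1 - exp(-lambda x) for x >= 0, and 0 otherwise."""
    return 1.0 - math.exp(-LAMBDA * x) if x >= 0 else 0.0


def pdf(x: float) -> float:
    """The derivative of F for x > 0: lambda * exp(-lambda x)."""
    return LAMBDA * math.exp(-LAMBDA * x) if x >= 0 else 0.0


def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Approximate f'(x) by (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


for x in (0.1, 0.7, 3.0):
    approx = central_difference(cdf, x)
    # The approximation is nonnegative, as expected for a nondecreasing F.
    print(x, approx, pdf(x))
```

The points are chosen away from ${\displaystyle x=0}$, where this particular ${\displaystyle F}$ is not differentiable.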

Without further assumptions, the pdf is not unique, i.e. a random variable may have multiple pdfs. For example, we may change the value of the pdf at a single point outside its support to any nonnegative real number: this does not affect any probability, since the integral over a single point is zero regardless of the value there, so it gives another valid pdf for the same random variable. To tackle this, we conventionally set ${\displaystyle f(x)=0}$ for each ${\displaystyle x\notin \operatorname {supp} (X)}$ to make the pdf unique and the calculations more convenient.

Example. (Uniform distribution) Given that

${\displaystyle f(x)=\mathbf {1} \{1\leq x\leq 5\}/4}$

is a pdf of a continuous random variable ${\displaystyle X}$ , the probability ${\displaystyle \mathbb {P} (2

Exercise.

It is given that the function ${\displaystyle f(x)=\mathbf {1} \{1\leq x\leq 6\}e^{x}/(e^{6}-e)}$  is a pdf of a continuous random variable ${\displaystyle X}$ .

1 Compute ${\displaystyle \mathbb {P} (X>3)}$ .

 ${\displaystyle {\frac {e^{6}-e^{3}}{e^{6}-e}}}$ ${\displaystyle {\frac {e^{3}-e}{e^{6}-e}}}$ ${\displaystyle {\frac {e^{3}}{e^{6}-e}}}$ ${\displaystyle e^{3}-e}$ ${\displaystyle e^{6}-e^{3}}$

2 Compute ${\displaystyle \mathbb {P} (X>3|X<4)}$ .

 ${\displaystyle {\frac {e^{4}-e}{e^{4}-e^{3}}}}$ ${\displaystyle {\frac {e^{3}-e}{e^{4}-e^{3}}}}$ ${\displaystyle {\frac {e^{4}-e^{3}}{e^{4}-e}}}$ ${\displaystyle {\frac {e^{3}-e}{e^{4}-e}}}$ ${\displaystyle 1}$

3 Compute ${\displaystyle \mathbb {P} (X>3|X\geq 4)}$ .

 ${\displaystyle 1-{\frac {e^{4}-e}{e^{4}-e^{3}}}}$ ${\displaystyle 1-{\frac {e^{3}-e}{e^{4}-e^{3}}}}$ ${\displaystyle 1-{\frac {e^{4}-e^{3}}{e^{4}-e}}}$ ${\displaystyle 1-{\frac {e^{3}-e}{e^{4}-e}}}$ ${\displaystyle 0}$

## Mixed random variables

After reading the previous two sections, you may think that a random variable must be either discrete or continuous. Actually, this is wrong: a random variable can be neither discrete nor continuous. An example of such a random variable is a mixed random variable, which is discussed in this section.

Theorem. (cdf decomposition) The cdf ${\displaystyle F(x)}$  of each random variable ${\displaystyle X}$  can be decomposed as a sum of three components:

${\displaystyle F(x)=\alpha _{d}F_{d}(x)+\alpha _{c}F_{c}(x)+\alpha _{s}F_{s}(x)}$

for some nonnegative constants ${\displaystyle \alpha _{d},\alpha _{c},\alpha _{s}}$ such that ${\displaystyle \alpha _{d}+\alpha _{c}+\alpha _{s}=1}$, in which ${\displaystyle x}$ is a real number, and ${\displaystyle F_{d},F_{c},F_{s}}$ are the cdfs of a discrete, a continuous, and a singular random variable respectively.

Remark.

• If ${\displaystyle \alpha _{d}\neq 0}$  and ${\displaystyle \alpha _{c}\neq 0}$ , then ${\displaystyle X}$  is a mixed random variable.
• We will not discuss singular random variable in this book, since it is quite advanced.
• One interpretation of this formula is:
${\displaystyle X={\begin{cases}{\text{discrete random variable having cdf }}F_{d}{\text{ with probability }}\alpha _{d};\\{\text{continuous random variable having cdf }}F_{c}{\text{ with probability }}\alpha _{c};\\{\text{singular random variable having cdf }}F_{s}{\text{ with probability }}\alpha _{s}.\end{cases}}}$

• If ${\displaystyle X}$  is discrete (continuous) random variable, then ${\displaystyle \alpha _{c}=\alpha _{s}=0}$  (${\displaystyle \alpha _{d}=\alpha _{s}=0}$ ).
• We may also decompose pdf similarly, but we have different ways to find pdf of discrete and continuous random variable from the corresponding cdf.

An example of the cdf of a singular random variable is the Cantor distribution function (sometimes known as the Devil's Staircase), which is illustrated by the following graph. The pattern of the graph keeps repeating as you enlarge it.

Cantor distribution function

Example. Let ${\displaystyle F_{d}(x)={\frac {1}{3}}\mathbf {1} \{x\geq 3\}+{\frac {2}{3}}\mathbf {1} \{x\geq 7\}}$. Let ${\displaystyle F_{c}(x)=\mathbf {1} \{x\geq 1\}(x-1)/(x+1)}$. Then, ${\displaystyle F(x)=(1/2)F_{d}(x)+(1/2)F_{c}(x)}$ is a cdf of a mixed random variable ${\displaystyle X}$, with probability ${\displaystyle 1/2}$ to be discrete and probability ${\displaystyle 1/2}$ to be continuous, since it is nonnegative, nondecreasing, right-continuous and ${\displaystyle \lim _{x\to \infty }F(x)=(1/2)\left[\lim _{x\to \infty }(F_{d}(x)+F_{c}(x))\right]=(1/2)(1+1)=1}$.
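The mixed cdf of this example can be evaluated exactly to see both parts at work: the jumps come from the discrete part, and the rest of the increase comes from the continuous part. The following is a minimal sketch using exact fractions; the helper `F_left` (the left limit of ${\displaystyle F}$) and the function `jump` are illustrative names.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def F_d(x) -> Fraction:
    """Discrete part: jump 1/3 at x = 3 and jump 2/3 at x = 7."""
    x = Fraction(x)
    return (Fraction(1, 3) if x >= 3 else 0) + (Fraction(2, 3) if x >= 7 else 0)


def F_c(x) -> Fraction:
    """Continuous part: (x - 1) / (x + 1) for x >= 1, and 0 otherwise."""
    x = Fraction(x)
    return (x - 1) / (x + 1) if x >= 1 else Fraction(0)


def F(x) -> Fraction:
    """Mixed cdf with weights 1/2 and 1/2."""
    return HALF * F_d(x) + HALF * F_c(x)


def F_left(x) -> Fraction:
    """Left limit of F at x: only the discrete part changes (strict inequalities)."""
    x = Fraction(x)
    d_left = (Fraction(1, 3) if x > 3 else 0) + (Fraction(2, 3) if x > 7 else 0)
    return HALF * d_left + HALF * F_c(x)


def jump(x) -> Fraction:
    """P(X = x) = F(x) - F(x-)."""
    return F(x) - F_left(x)


print(F(3), F(7))
print(jump(3), jump(7), jump(5))
# The total size of the jumps equals the weight of the discrete part.
print(jump(3) + jump(7))
```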

Exercise. Consider the function ${\displaystyle F(x)={\frac {\mathbf {1} \{x\geq 8\}+(1-1/x)\mathbf {1} \{x\geq 1\}}{k}}}$ . It is given that ${\displaystyle F(x)}$  is a cdf of a random variable ${\displaystyle X}$ .

(a) Show that ${\displaystyle k=2}$ .

(b) Show that the pdf of ${\displaystyle X}$  is

${\displaystyle f(x)={\frac {1}{2}}(\mathbf {1} \{x=8\}+x^{-2}\mathbf {1} \{x\geq 1\}).}$

(c) Show that the probability for ${\displaystyle X}$  to be continuous is ${\displaystyle 1/k}$ .

(d) Show that ${\displaystyle \mathbb {P} (X\geq 3|X\leq 8)}$ is ${\displaystyle 29/45}$.

(e) Show that the events ${\displaystyle \{X\geq 3\}}$ and ${\displaystyle \{X\leq m\}}$ are not independent for any ${\displaystyle m\geq 8}$.

Proof.

(a) Since ${\displaystyle F}$ is a cdf, and ${\displaystyle \mathbf {1} \{x\geq 8\}=\mathbf {1} \{x\geq 1\}=1}$ when ${\displaystyle x\to \infty }$,

${\displaystyle \lim _{x\to \infty }F(x)=1\implies {\frac {1+1}{k}}=1\implies k=2.}$

(b) Since ${\displaystyle X}$ is a mixed random variable, for the discrete random variable part, the pdf is

${\displaystyle f_{d}(x)=\mathbf {1} \{x=8\}/2.}$

On the other hand, for the continuous random variable part, the pdf is
${\displaystyle f_{c}(x)=\mathbf {1} \{x\geq 1\}x^{-2}/2.}$

Therefore, the pdf of ${\displaystyle X}$ is
${\displaystyle f(x)={\frac {1}{2}}(\mathbf {1} \{x=8\}+x^{-2}\mathbf {1} \{x\geq 1\})}$

(c) We can see that ${\displaystyle F(x)}$ can be decomposed as follows:

${\displaystyle F(x)={\frac {1}{2}}(\mathbf {1} \{x\geq 8\})+{\frac {1}{2}}((1-1/x)\mathbf {1} \{x\geq 1\}).}$

Thus, the probability for ${\displaystyle X}$ to be continuous is ${\displaystyle 1/k=1/2}$.

(d) We have ${\displaystyle F(8)=(1+7/8)/2=15/16}$, ${\displaystyle F(3)=(1-1/3)/2=1/3}$ and ${\displaystyle \mathbb {P} (X=3)=0}$. Note that ${\displaystyle \mathbb {P} (X\leq 8)<1}$, since the continuous part takes values in the whole of ${\displaystyle [1,\infty )}$. Hence,

${\displaystyle \mathbb {P} (X\geq 3|X\leq 8)={\frac {\mathbb {P} (3\leq X\leq 8)}{\mathbb {P} (X\leq 8)}}={\frac {\mathbb {P} (X\leq 8)-\mathbb {P} (X\leq 3)+\overbrace {\mathbb {P} (X=3)} ^{0}}{\mathbb {P} (X\leq 8)}}={\frac {15/16-1/3}{15/16}}={\frac {29/48}{15/16}}=29/45.}$

(e) If ${\displaystyle m\geq 8}$, ${\displaystyle \mathbb {P} (X\leq m)=F(m)=(1+1-1/m)/2=1-1/(2m)<1}$, and ${\displaystyle \mathbb {P} (X\geq 3)=1-\mathbb {P} (X\leq 3)+\mathbb {P} (X=3)=1-1/3+0=2/3}$. Thus,

${\displaystyle \mathbb {P} (X\geq 3\cap X\leq m)=\mathbb {P} (X\leq m)-\mathbb {P} (X\leq 3)+\underbrace {\mathbb {P} (X=3)} _{0}={\frac {2}{3}}-{\frac {1}{2m}},\quad {\text{whereas}}\quad \mathbb {P} (X\geq 3)\mathbb {P} (X\leq m)={\frac {2}{3}}\left(1-{\frac {1}{2m}}\right)={\frac {2}{3}}-{\frac {1}{3m}}.}$

Since ${\displaystyle 1/(2m)\neq 1/(3m)}$, the two quantities differ, i.e. ${\displaystyle \{X\geq 3\}}$ and ${\displaystyle \{X\leq m\}}$ are not independent.

${\displaystyle \Box }$