Let \( \{X_{nk}\},\ k=1,\ldots ,r_{n},\ n=1,2,\ldots \) be a triangular array of Bernoulli random variables, independent within each row, with \( p_{nk}=P[X_{nk}=1] \). Suppose that
\[ \sum _{k=1}^{r_{n}}p_{nk}\to \lambda \quad {\text{and}}\quad \max _{k\leq r_{n}}p_{nk}\to 0. \]
Find the limiting distribution of \( \sum _{k=1}^{r_{n}}X_{nk} \).
We will show that it converges to a Poisson distribution with parameter \( \lambda \). The characteristic function of the Poisson distribution is \( e^{\lambda (e^{it}-1)} \). We show that the characteristic function \( E[\exp(it\sum _{k=1}^{r_{n}}X_{nk})] \) converges to \( e^{\lambda (e^{it}-1)} \), which implies the result by the continuity theorem for characteristic functions.
Using independence within each row,
\[ \log E\Bigl[\exp \Bigl(it\sum _{k=1}^{r_{n}}X_{nk}\Bigr)\Bigr]=\sum _{k=1}^{r_{n}}\log \bigl((1-p_{nk})+p_{nk}e^{it}\bigr)=\sum _{k=1}^{r_{n}}\log \bigl(1-p_{nk}(1-e^{it})\bigr)=\sum _{k=1}^{r_{n}}\bigl(-p_{nk}(1-e^{it})+O(p_{nk}^{2})\bigr), \]
where the Taylor expansion of the logarithm is valid for large \( n \) because \( \max _{k\leq r_{n}}p_{nk}\to 0 \). By our assumptions, \( \sum _{k=1}^{r_{n}}p_{nk}\to \lambda \) while \( \sum _{k=1}^{r_{n}}p_{nk}^{2}\leq \bigl(\max _{k\leq r_{n}}p_{nk}\bigr)\sum _{k=1}^{r_{n}}p_{nk}\to 0 \), so this converges to \( \lambda (e^{it}-1) \).
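As a quick numerical check (my own illustration, not part of the original argument), one can compare the characteristic function of a row sum of independent Bernoullis, \( \prod _{k}((1-p_{nk})+p_{nk}e^{it}) \), with the Poisson characteristic function \( e^{\lambda (e^{it}-1)} \); the row size and value of \( \lambda \) below are arbitrary choices:

```python
import cmath

def bernoulli_sum_cf(ps, t):
    """CF of a sum of independent Bernoulli(p) variables: prod((1-p) + p*e^{it})."""
    cf = 1.0 + 0.0j
    for p in ps:
        cf *= (1 - p) + p * cmath.exp(1j * t)
    return cf

def poisson_cf(lam, t):
    """CF of Poisson(lambda): exp(lambda * (e^{it} - 1))."""
    return cmath.exp(lam * (cmath.exp(1j * t) - 1))

lam = 2.0
r_n = 10_000                 # illustrative row size
ps = [lam / r_n] * r_n       # sum of p_nk equals lambda; max p_nk is small

t = 1.3
diff = abs(bernoulli_sum_cf(ps, t) - poisson_cf(lam, t))
print(diff)  # small for large r_n
```

The gap shrinks as \( r_{n} \) grows, consistent with the \( O(p_{nk}^{2}) \) error term in the derivation above.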
Let \( X \) be square-integrable and let \( {\mathcal {G}}_{1}\subseteq {\mathcal {G}}_{2} \) be sub-\( \sigma \)-algebras. We show that \( E[(X-E[X|{\mathcal {G}}_{1}])^{2}]\geq E[(X-E[X|{\mathcal {G}}_{2}])^{2}] \), i.e., conditioning on the finer \( \sigma \)-algebra gives the smaller mean-squared error. Expanding,
\[ E[(X-E[X|{\mathcal {G}}_{1}])^{2}]=E[((X-E[X|{\mathcal {G}}_{2}])+(E[X|{\mathcal {G}}_{2}]-E[X|{\mathcal {G}}_{1}]))^{2}] \]
\[ =E[(X-E[X|{\mathcal {G}}_{2}])^{2}]+E[(E[X|{\mathcal {G}}_{2}]-E[X|{\mathcal {G}}_{1}])^{2}]+2E[(X-E[X|{\mathcal {G}}_{2}])(E[X|{\mathcal {G}}_{2}]-E[X|{\mathcal {G}}_{1}])]. \]
We will show that the third term vanishes. Then since the second term is nonnegative, the result follows.
\[ E[(X-E[X|{\mathcal {G}}_{2}])(E[X|{\mathcal {G}}_{2}]-E[X|{\mathcal {G}}_{1}])]=E\bigl[E[(X-E[X|{\mathcal {G}}_{2}])(E[X|{\mathcal {G}}_{2}]-E[X|{\mathcal {G}}_{1}])\,|\,{\mathcal {G}}_{2}]\bigr] \]
by the tower property of conditional expectation.
\[ E[(X-E[X|{\mathcal {G}}_{2}])(E[X|{\mathcal {G}}_{2}]-E[X|{\mathcal {G}}_{1}])\,|\,{\mathcal {G}}_{2}]=(E[X|{\mathcal {G}}_{2}]-E[X|{\mathcal {G}}_{1}])\,E[(X-E[X|{\mathcal {G}}_{2}])\,|\,{\mathcal {G}}_{2}], \]
since \( E[X|{\mathcal {G}}_{2}]-E[X|{\mathcal {G}}_{1}] \) is \( {\mathcal {G}}_{2} \)-measurable (here \( E[X|{\mathcal {G}}_{1}] \) is \( {\mathcal {G}}_{1} \)-measurable and \( {\mathcal {G}}_{1}\subseteq {\mathcal {G}}_{2} \)).
Finally,
\[ E[(X-E[X|{\mathcal {G}}_{2}])\,|\,{\mathcal {G}}_{2}]=E[X|{\mathcal {G}}_{2}]-E[E[X|{\mathcal {G}}_{2}]\,|\,{\mathcal {G}}_{2}]=E[X|{\mathcal {G}}_{2}]-E[X|{\mathcal {G}}_{2}]=0. \]
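The inequality can be checked on a small discrete example (my own illustrative numbers and partitions, not from the text): on a uniform four-point space, conditioning on a finer partition never increases the mean-squared error.

```python
omega = [0, 1, 2, 3]            # uniform probability 1/4 each
X = {0: 1.0, 1: 3.0, 2: 2.0, 3: 6.0}
G1 = [{0, 1, 2, 3}]             # trivial sigma-algebra (coarse)
G2 = [{0, 1}, {2, 3}]           # finer partition: G1 is a sub-sigma-algebra of G2

def cond_exp(X, partition):
    """E[X | partition] as a function on omega: average of X over each block."""
    out = {}
    for block in partition:
        avg = sum(X[w] for w in block) / len(block)
        for w in block:
            out[w] = avg
    return out

def mse(X, partition):
    """E[(X - E[X|G])^2] under the uniform measure."""
    ce = cond_exp(X, partition)
    return sum((X[w] - ce[w]) ** 2 for w in omega) / len(omega)

print(mse(X, G1), mse(X, G2))  # prints 3.5 2.5
```

The coarse conditional expectation is the overall mean \( 3 \), giving MSE \( 3.5 \); the finer one averages within each block, giving the smaller MSE \( 2.5 \).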
Consider a sequence of random variables \( X_{1},X_{2},\ldots \) such that \( X_{n}=1 \) or \( 0 \). Assume \( P[X_{1}=1]\geq \alpha \) and
\[ P[X_{n}=1\,|\,X_{1},\ldots ,X_{n-1}]\geq \alpha >0\quad {\text{for }}n=2,3,\ldots \]
Prove that
(a) \( P[X_{n}=1{\text{ for some }}n]=1 \);
(b) \( P[X_{n}=1{\text{ infinitely often}}]=1 \).
We show \( P[X_{n}=1{\text{ finitely often}}]=0 \), which gives both (a) and (b). If \( X_{n}=1 \) for only finitely many \( n \), then there is some index \( T \) with \( X_{n}=0 \) for all \( n\geq T \); that is, the event is contained in the countable union \( \bigcup _{T\geq 1}[X_{n}=0{\text{ for all }}n\geq T] \). It therefore suffices to show that for every \( T \),
\[ P[X_{n}=0{\text{ for all }}n\geq T]=0. \]
First notice that \( P[X_{1}=0]\leq 1-\alpha \), and for \( T>1 \),
\[ P[X_{T}=0]=E\bigl[P[X_{T}=0\,|\,X_{1},X_{2},\ldots ,X_{T-1}]\bigr]\leq 1-\alpha . \]
Then let \( A_{n}^{(T)} \) denote the event \( [X_{T+n-1}=\ldots =X_{T}=0] \), so that
\[ P[X_{n}=0{\text{ for all }}n\geq T]=P[A_{n}^{(T)}{\text{ occurs for all }}n]. \]
Notice that
\[ P[A_{n}^{(T)}]=P[X_{T+n-1}=0\,|\,A_{n-1}^{(T)}]\,P[A_{n-1}^{(T)}]\leq (1-\alpha )P[A_{n-1}^{(T)}]\quad {\text{for }}n=2,3,\ldots \]
and \( P[A_{1}^{(T)}]=P[X_{T}=0]\leq 1-\alpha \). Therefore \( P[A_{n}^{(T)}]\leq (1-\alpha )^{n} \) and \( \lim _{n\to \infty }P[A_{n}^{(T)}]=0 \). Since the events \( A_{n}^{(T)} \) decrease in \( n \), continuity from above gives
\[ P[X_{n}=0{\text{ for all }}n\geq T]=\lim _{n\to \infty }P[A_{n}^{(T)}]=0, \]
and we reach the desired conclusion.
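The geometric bound can be made concrete (a toy chain of my own construction, not from the text): take any rule with \( P[X_{n}=1\,|\,{\text{history}}]\geq \alpha \); since the event \( A_{n}^{(T)} \) is a single all-zero path, its probability is an exact chain-rule product and sits below \( (1-\alpha )^{n} \):

```python
alpha = 0.3

def p_one(history):
    """Toy conditional probability that the next X is 1; always >= alpha."""
    # illustrative rule: a recent 1 makes another 1 slightly more likely
    return alpha + (0.2 if history and history[-1] == 1 else 0.0)

def prob_all_zero(n):
    """P[X_1 = ... = X_n = 0], exact chain-rule product along the all-zero path."""
    p, history = 1.0, []
    for _ in range(n):
        p *= 1 - p_one(history)
        history.append(0)
    return p

for n in (1, 5, 10, 20):
    print(n, prob_all_zero(n), (1 - alpha) ** n)
```

For this rule the all-zero history never raises the conditional probability, so the product equals \( (1-\alpha )^{n} \) exactly and tends to \( 0 \), matching the argument above.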