# Econometric Theory/Asymptotic Convergence

## Asymptotic Convergence

### Modes of Convergence

#### Convergence in Probability

Convergence in probability is a key tool for deriving asymptotic distributions later in this book. Alongside convergence in distribution, it is the mode of convergence we will meet most often.

##### Definition

A sequence of random variables $\{X_{n};n=1,2,\cdots \}$ converges in probability to $X$ if:

$\forall \epsilon ,\delta >0,\;\exists N\;{\text{s.t.}}\;\forall n\geq N,\;\Pr\{|X_{n}-X|>\delta \}<\epsilon$

An equivalent statement is:

$\forall \delta >0,\;\lim _{n\to \infty }\Pr\{|X_{n}-X|>\delta \}=0$

This will be written as either $X_{n}\xrightarrow {p} X$ or $\operatorname {plim} X_{n}=X$.

##### Example

$X_{n}={\begin{cases}\eta &{\text{with probability }}1-{\frac {1}{n}}\\\theta &{\text{with probability }}{\frac {1}{n}}\end{cases}}$

We'll make an intelligent guess that this sequence converges in probability to the degenerate random variable $\eta$ (a constant). So we have that:

$\forall \delta >0,\;\Pr\{|X_{n}-\eta |>\delta \}\leq \Pr\{|X_{n}-\eta |>0\}=\Pr\{X_{n}=\theta \}={\frac {1}{n}}$

The definition of convergence in probability therefore requires, in this case:

$\forall \epsilon ,\delta >0,\;\exists N\;{\text{s.t.}}\;\forall n\geq N,\;\Pr\{|X_{n}-\eta |>\delta \}\leq \Pr\{|X_{n}-\eta |>0\}=\Pr\{X_{n}=\theta \}={\frac {1}{n}}<\epsilon$

For any $\epsilon >0$ we can always find an $N\in \mathbb {N}$ large enough that this holds (any $N>1/\epsilon$ works, since then ${\frac {1}{n}}\leq {\frac {1}{N}}<\epsilon$ for all $n\geq N$). Therefore we have proved that $X_{n}\xrightarrow {p} \eta$.
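The bound above can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original derivation: the values $\eta =0$, $\theta =1$, $\delta =0.5$, the number of trials, and the function names `draw_x` and `exceed_prob` are all illustrative assumptions.

```python
import random

def draw_x(n, eta=0.0, theta=1.0, rng=random):
    """Draw X_n: theta with probability 1/n, eta otherwise."""
    return theta if rng.random() < 1.0 / n else eta

def exceed_prob(n, delta=0.5, eta=0.0, theta=1.0, trials=100_000, seed=0):
    """Monte Carlo estimate of Pr{|X_n - eta| > delta}."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(abs(draw_x(n, eta, theta, rng) - eta) > delta
               for _ in range(trials))
    return hits / trials

# Compare the estimate with the exact value 1/n from the text.
for n in (1, 10, 100, 1000):
    print(n, exceed_prob(n), 1.0 / n)
```

The estimated probability should track $1/n$ and shrink towards zero as $n$ grows, which is what the definition asks for.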

#### Almost-Sure Convergence

Almost-sure convergence closely resembles convergence in probability, but its conditions are stronger: as we will see later, almost-sure convergence implies convergence in probability.

##### Definition

A sequence of random variables $\{X_{n};n=1,2,\cdots \}$ converges almost surely to the random variable $X$ if:

$\forall \delta >0,\;\lim _{n\to \infty }\Pr \left\{\bigcup _{m\geq n}\{|X_{m}-X|>\delta \}\right\}=0$

or, equivalently,

$\Pr\{\lim _{n\to \infty }X_{n}=X\}=1$

Under these conditions we use the notation $X_{n}\xrightarrow {a.s.} X$ or $\lim _{n\to \infty }X_{n}=X\;{\text{a.s.}}$

##### Example

Let's see if our example from the convergence in probability section also converges almost surely. Defining:

$X_{n}={\begin{cases}\eta &{\text{with probability }}1-{\frac {1}{n}}\\\theta &{\text{with probability }}{\frac {1}{n}}\end{cases}}$

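we again guess that the limit is $\eta$. Unlike convergence in probability, however, almost-sure convergence depends on how the $X_{n}$ are related across $n$, and the marginal probabilities above do not determine this. Suppose, as one possible construction, that all the $X_{n}$ are built from a single random variable $U$ that is uniform on $[0,1]$, with $X_{n}=\theta$ if $U<1/n$ and $X_{n}=\eta$ otherwise. Then on the event $\{U>0\}$ we have $X_{n}=\eta$ for every $n>1/U$, so:

$\Pr\{\lim _{n\to \infty }X_{n}=\eta \}\geq \Pr\{U>0\}=1$

which satisfies our definition of almost-sure convergence. If instead the $X_{n}$ are independent and $\theta \neq \eta$, then $\sum _{n}\Pr\{X_{n}=\theta \}=\sum _{n}{\frac {1}{n}}=\infty$, and the second Borel-Cantelli lemma implies that $X_{n}=\theta$ infinitely often with probability 1. In that case the sequence does not converge almost surely, even though it still converges in probability.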
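A small simulation can make the role of the joint distribution concrete. The sketch below is illustrative only. It assumes two hypothetical ways of generating the sequence, neither of which is specified in the text: one shared uniform draw for all $n$, and independent draws for each $n$. For each, it estimates the probability that $X_{m}=\theta$ for some $m$ in the second half of a finite path, a truncated version of the union in the definition. All function names and parameter values are assumptions.

```python
import random

def last_theta_coupled(n_max, rng):
    """X_n = theta iff U < 1/n for one shared uniform U.
    Returns the last n <= n_max with X_n = theta (0 if none)."""
    u = rng.random()
    last = 0
    for n in range(1, n_max + 1):
        if u < 1.0 / n:
            last = n
    return last

def last_theta_independent(n_max, rng):
    """X_n = theta with probability 1/n, drawn independently for each n.
    Returns the last n <= n_max with X_n = theta."""
    last = 0
    for n in range(1, n_max + 1):
        if rng.random() < 1.0 / n:
            last = n
    return last

def share_with_late_theta(sampler, n_max=2000, paths=500, seed=0):
    """Fraction of simulated paths with X_m = theta for some m > n_max/2."""
    rng = random.Random(seed)
    late = sum(sampler(n_max, rng) > n_max // 2 for _ in range(paths))
    return late / paths

print("coupled:    ", share_with_late_theta(last_theta_coupled))
print("independent:", share_with_late_theta(last_theta_independent))
```

Under the shared-draw construction the share should be close to zero, while under independence it should stay near one half however large `n_max` is, because $\prod _{m=n+1}^{2n}(1-{\frac {1}{m}})={\frac {1}{2}}$.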

#### Convergence in Distribution

Convergence in distribution will appear very frequently in our econometric models through the use of the Central Limit Theorem. So let's define this type of convergence.

##### Definition

A sequence of random variables $\{X_{n};n=1,2,\cdots \}$ converges in distribution to the random variable $X$ if $F_{X_{n}}(\zeta )\rightarrow F_{X}(\zeta )$ as $n\to \infty$ at every point $\zeta$ where $F_{X}$ is continuous. Here $F_{X_{n}}(\zeta )$ and $F_{X}(\zeta )$ are the cumulative distribution functions of $X_{n}$ and $X$ respectively.

It is the distribution of the random variable that we are concerned with here. Think of a Student's t distribution: as the degrees of freedom, $n$, increase, the distribution becomes closer and closer to a Gaussian distribution. Therefore the random variable $Y_{n}\sim t(n)$ converges in distribution to the random variable $Y\sim N(0,1)$. (N.b. writing $Y_{n}\xrightarrow {d} Y$ is a notational crutch; what actually converges is the sequence of distribution functions, $F_{Y_{n}}(\zeta )\rightarrow F_{Y}(\zeta )$.)

##### Example

Let's consider the random variable $X_{n}$ that takes the two values $1/n$ and $1$ with equal probability (1/2 each). Let $X$ be a Bernoulli random variable with $p=1/2$ (a binomial with a single trial), so that $X$ takes the values 0 and 1 with probability 1/2 each. Then $X_{n}$ converges in distribution to $X$.

The proof is simple: we ignore 0 and 1 (where the distribution function of $X$ is discontinuous) and prove that, for all other points $a$, $\lim _{n\to \infty }F_{X_{n}}(a)=F_{X}(a)$. Since for $a<0$ all the distribution functions are 0, and for $a>1$ they are all 1, it remains to prove the convergence for $0<a<1$, where $F_{X}(a)={\frac {1}{2}}$. But $F_{X_{n}}(a)={\frac {1}{2}}([a\geq {\frac {1}{n}}]+[a\geq 1])$ (using Iverson brackets), so for any such $a$ choose $N>1/a$, and for $n>N$ we have:

$n>1/a\;\Rightarrow \;a>1/n\;\Rightarrow \;[a\geq {\frac {1}{n}}]=1\land [a\geq 1]=0\;\Rightarrow \;F_{X_{n}}(a)={\frac {1}{2}}=F_{X}(a)$

So the sequence $F_{X_{n}}(a)$ converges to $F_{X}(a)$ at all points where $F_{X}$ is continuous.
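The two distribution functions in this example can be evaluated exactly, which also shows why the discontinuity point 0 has to be excluded. The sketch below uses exact rational arithmetic; the function names `cdf_xn` and `cdf_x` and the evaluation point $a=1/20$ are illustrative choices.

```python
from fractions import Fraction

def cdf_xn(a, n):
    """F_{X_n}(a) when X_n takes the values 1/n and 1 with probability 1/2 each."""
    return Fraction(1, 2) * ((a >= Fraction(1, n)) + (a >= 1))

def cdf_x(a):
    """F_X(a) when X takes the values 0 and 1 with probability 1/2 each."""
    return Fraction(1, 2) * ((a >= 0) + (a >= 1))

# At a continuity point of F_X, F_{X_n}(a) reaches F_X(a) once n >= 1/a.
a = Fraction(1, 20)
for n in (5, 20, 21, 1000):
    print(n, cdf_xn(a, n), cdf_x(a))

# At the discontinuity a = 0 the values never agree, however large n is.
print(cdf_xn(0, 10**6), cdf_x(0))
```

At $a=0$ we have $F_{X_{n}}(0)=0$ for every $n$ while $F_{X}(0)={\frac {1}{2}}$, so pointwise convergence fails there. The definition allows this because 0 is not a continuity point of $F_{X}$.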

#### Convergence in r-th Mean

Convergence in r-th mean is not going to be used in this book; however, for completeness the definition is provided below.

##### Definition

A sequence of random variables $\{X_{n};n=1,2,\cdots \}$ converges in r-th mean (or in the $L^{r}$ norm) to the random variable $X$ if, for a given real number $r\geq 1$ and provided that $E(|X_{n}|^{r})<\infty$ for all $n$,

$\lim _{n\to \infty }E\left(\left\vert X_{n}-X\right\vert ^{r}\right)=0.$
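As a quick illustration, take the two-point sequence from the convergence in probability example and assume that $\eta$ and $\theta$ are fixed real constants. Since $|X_{n}-\eta |^{r}$ equals $|\theta -\eta |^{r}$ with probability ${\frac {1}{n}}$ and 0 otherwise, for any $r\geq 1$:

$E\left(\left\vert X_{n}-\eta \right\vert ^{r}\right)=|\theta -\eta |^{r}\cdot {\frac {1}{n}}+0\cdot \left(1-{\frac {1}{n}}\right)={\frac {|\theta -\eta |^{r}}{n}}\rightarrow 0$

so under that assumption the sequence also converges to $\eta$ in r-th mean.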

#### Cramér-Wold Device

The Cramér-Wold device will allow us to extend our convergence techniques for random variables from scalars to vectors.

##### Definition

For random vectors $\mathbf {X} _{n}$ and $\mathbf {X}$ of the same dimension, $\mathbf {X} _{n}\xrightarrow {d} \mathbf {X} \;\iff \;{\boldsymbol {\lambda }}^{\operatorname {T} }\mathbf {X} _{n}\xrightarrow {d} {\boldsymbol {\lambda }}^{\operatorname {T} }\mathbf {X} \quad \forall {\boldsymbol {\lambda }}\neq \mathbf {0}$, where ${\boldsymbol {\lambda }}$ is a fixed (non-random) vector.

### Central Limit Theorem

Let $X_{1},X_{2},X_{3},\ldots$ be a sequence of random variables which are defined on the same probability space, share the same probability distribution $D$ and are independent. Assume that both the expected value $\mu$ and the standard deviation $\sigma$ of $D$ exist and are finite.

Consider the sum $S_{n}=X_{1}+\cdots +X_{n}$. Then the expected value of $S_{n}$ is $n\mu$ and its standard deviation is $\sigma {\sqrt {n}}$. Furthermore, informally speaking, the distribution of $S_{n}$ approaches the normal distribution $N(n\mu ,\sigma ^{2}n)$ as $n$ approaches $\infty$. More precisely, if $\sigma >0$, the standardized sum ${\frac {S_{n}-n\mu }{\sigma {\sqrt {n}}}}$ converges in distribution to $N(0,1)$.
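The statement can be illustrated by simulation. The sketch below is a rough check under stated assumptions rather than a proof: it takes $D$ to be the Uniform(0,1) distribution (so $\mu =1/2$ and $\sigma ^{2}=1/12$), fixes $n=30$, and compares the empirical distribution function of the standardized sum with the standard normal distribution function. The function names and parameter values are illustrative.

```python
import math
import random

def standardized_sum(n, rng):
    """(S_n - n*mu) / (sigma*sqrt(n)) for n i.i.d. Uniform(0,1) draws."""
    mu, sigma = 0.5, math.sqrt(1.0 / 12.0)
    s = sum(rng.random() for _ in range(n))
    return (s - n * mu) / (sigma * math.sqrt(n))

def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def empirical_cdf_at(z, n, reps=20_000, seed=0):
    """Share of simulated standardized sums that are <= z."""
    rng = random.Random(seed)
    return sum(standardized_sum(n, rng) <= z for _ in range(reps)) / reps

# The empirical and normal values should be close at each point z.
for z in (-1.0, 0.0, 1.0):
    print(z, empirical_cdf_at(z, n=30), normal_cdf(z))
```

Replacing the uniform draws with another distribution that has finite mean and variance (and adjusting `mu` and `sigma` to match) should give a similar picture for large enough $n$.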