Hypergeometric
Notation
$h(k)=\dfrac{\binom{m}{k}\binom{N-m}{n-k}}{\binom{N}{n}}$
Parameters
???
Support
???
PMF
???
CDF
???
Mean
$\dfrac{nm}{N}$
Median
???
Mode
???
Variance
$\dfrac{nm}{N}\left(1-\dfrac{n}{N}\right)\left(1-\dfrac{m-1}{N-1}\right)$
Skewness
???
Ex. kurtosis
???
Entropy
???
MGF
???
CF
???
PGF
???
Fisher information
???
The hypergeometric distribution describes the number of successes in a sequence of n draws without replacement from a population of N items that contains m successes in total.
Its probability mass function is:
$$f(x)=\frac{\binom{m}{x}\binom{N-m}{n-x}}{\binom{N}{n}}\quad\text{for all }x\in[0,n]$$
Technically, the support of the function is only the integers $x\in[\max(0,\,n+m-N),\,\min(m,\,n)]$. When this range is narrower than $[0,n]$, $f(x)=0$ for the excluded values, because a binomial coefficient $\binom{a}{k}$ is zero whenever $k>a\geq 0$ (for example, $\binom{0}{k}=0$ for $k>0$).
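The pmf and its support can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the function names `hypergeom_pmf` and `support` are hypothetical, and `math.comb` requires Python 3.8 or later. The example values (N = 10, m = 7, n = 5) are chosen so that the support is narrower than [0, n].

```python
from math import comb


def hypergeom_pmf(x: int, n: int, m: int, N: int) -> float:
    """P(X = x) for n draws without replacement from N items, m of which are successes."""
    if x < 0 or x > n:
        return 0.0
    # math.comb(a, k) returns 0 when k > a, so values of x outside
    # [max(0, n + m - N), min(m, n)] automatically get probability 0.
    return comb(m, x) * comb(N - m, n - x) / comb(N, n)


def support(n: int, m: int, N: int) -> range:
    """Integer values of x where the pmf is strictly positive."""
    return range(max(0, n + m - N), min(m, n) + 1)


# Hypothetical example: only 3 failures exist, so at least 2 of the
# 5 draws must be successes and f(0) = f(1) = 0.
n, m, N = 5, 7, 10
for x in range(n + 1):
    print(x, hypergeom_pmf(x, n, m, N))
print(list(support(n, m, N)))
```

Letting `math.comb` return zero for out-of-range arguments mirrors the argument in the text, so no explicit support check is needed inside the pmf.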
Probability Mass Function
We first check that f(x) is a valid pmf. This requires that it is non-negative everywhere and that it sums to 1. The first condition holds because binomial coefficients are never negative. For the second condition we start with Vandermonde's identity
$$\sum_{x=0}^{n}\binom{a}{x}\binom{b}{n-x}=\binom{a+b}{n}$$
Dividing both sides by $\binom{a+b}{n}$ gives
$$\sum_{x=0}^{n}\frac{\binom{a}{x}\binom{b}{n-x}}{\binom{a+b}{n}}=1$$
Setting a=m and b=N-m turns the summand into f(x), so the pmf sums to 1 and the second condition is satisfied.
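The normalization argument can be spot-checked numerically. The sketch below uses exact rational arithmetic so that the total is exactly 1 rather than approximately 1. The helper names and the test parameters are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x: int, n: int, m: int, N: int) -> Fraction:
    """Exact pmf value as a rational number."""
    return Fraction(comb(m, x) * comb(N - m, n - x), comb(N, n))


def vandermonde_holds(a: int, b: int, n: int) -> bool:
    """Check sum_x C(a, x) * C(b, n - x) == C(a + b, n) for one choice of a, b, n."""
    lhs = sum(comb(a, x) * comb(b, n - x) for x in range(n + 1))
    return lhs == comb(a + b, n)


def pmf_total(n: int, m: int, N: int) -> Fraction:
    """Sum of the pmf over x = 0, ..., n."""
    return sum(hypergeom_pmf(x, n, m, N) for x in range(n + 1))


# Hypothetical parameters: a = m = 7, b = N - m = 3, n = 5.
print(vandermonde_holds(7, 3, 5))
print(pmf_total(5, 7, 10))
```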
We derive the mean as follows:
$$\operatorname{E}[X]=\sum_{x=0}^{n}x\cdot f(x;n,m,N)=\sum_{x=0}^{n}x\cdot\frac{\binom{m}{x}\binom{N-m}{n-x}}{\binom{N}{n}}$$
$$\operatorname{E}[X]=0\cdot\frac{\binom{m}{0}\binom{N-m}{n-0}}{\binom{N}{n}}+\sum_{x=1}^{n}x\cdot\frac{\binom{m}{x}\binom{N-m}{n-x}}{\binom{N}{n}}$$
We use the identity $\binom{a}{b}=\frac{a}{b}\binom{a-1}{b-1}$ in the denominator.
$$\operatorname{E}[X]=0+\sum_{x=1}^{n}x\cdot\frac{\binom{m}{x}\binom{N-m}{n-x}}{\frac{N}{n}\binom{N-1}{n-1}}$$
$$\operatorname{E}[X]=\frac{n}{N}\sum_{x=1}^{n}x\cdot\frac{\binom{m}{x}\binom{N-m}{n-x}}{\binom{N-1}{n-1}}$$
Next we use the identity $b\binom{a}{b}=a\binom{a-1}{b-1}$ in the first binomial of the numerator.
$$\operatorname{E}[X]=\frac{n}{N}\sum_{x=1}^{n}\frac{m\binom{m-1}{x-1}\binom{N-m}{n-x}}{\binom{N-1}{n-1}}$$
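Both rewriting steps above rely on the same binomial identity, written once as a quotient and once as a product. It is sometimes called the absorption identity. As a quick sanity check, it can be verified over a grid of small integers. The function name and the grid bounds below are arbitrary choices for illustration.

```python
from math import comb


def absorption_holds(a: int, b: int) -> bool:
    """Check b * C(a, b) == a * C(a - 1, b - 1) for integers a >= 1, b >= 1."""
    return b * comb(a, b) == a * comb(a - 1, b - 1)


# Check every pair with 1 <= b <= a < 30 (an arbitrary illustrative range).
checked = all(
    absorption_holds(a, b)
    for a in range(1, 30)
    for b in range(1, a + 1)
)
print(checked)
```

The product form is used in the check because it avoids division and so stays in exact integer arithmetic.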
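Next, for each variable inside the sum we define a corresponding primed variable that is one less: N′=N−1, m′=m−1, x′=x−1, n′=n−1. Note that N−m=N′−m′ and n−x=n′−x′, so the second binomial is unchanged, and that the constant factor m can be moved outside the sum.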
$$\operatorname{E}[X]=\frac{mn}{N}\sum_{x'=0}^{n'}\frac{\binom{m'}{x'}\binom{N'-m'}{n'-x'}}{\binom{N'}{n'}}$$
$$\operatorname{E}[X]=\frac{mn}{N}\sum_{x'=0}^{n'}f(x';n',m',N')$$
The sum is now the total of a hypergeometric pmf with the modified parameters (n′, m′, N′) over its whole range, which is equal to 1. Therefore
$$\operatorname{E}[X]=\frac{nm}{N}$$
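The closed form for the mean can be compared against the defining sum. The following sketch does so with exact fractions. The helper names and the parameter triples are hypothetical examples, not values from the text.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x: int, n: int, m: int, N: int) -> Fraction:
    """Exact pmf value as a rational number."""
    return Fraction(comb(m, x) * comb(N - m, n - x), comb(N, n))


def mean_by_sum(n: int, m: int, N: int) -> Fraction:
    """E[X] computed directly from the definition, sum of x * f(x)."""
    return sum(x * hypergeom_pmf(x, n, m, N) for x in range(n + 1))


def mean_closed_form(n: int, m: int, N: int) -> Fraction:
    """E[X] = n * m / N, the result of the derivation above."""
    return Fraction(n * m, N)


# Illustrative (n, m, N) triples; any valid parameters should agree.
for n, m, N in [(5, 7, 10), (4, 6, 20), (3, 3, 3)]:
    print((n, m, N), mean_by_sum(n, m, N), mean_closed_form(n, m, N))
```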
To find the variance, we first determine $\operatorname{E}[X^{2}]$.
$$\operatorname{E}[X^{2}]=\sum_{x=0}^{n}f(x;n,m,N)\cdot x^{2}=\sum_{x=0}^{n}\frac{\binom{m}{x}\binom{N-m}{n-x}}{\binom{N}{n}}\cdot x^{2}$$
$$\operatorname{E}[X^{2}]=\frac{\binom{m}{0}\binom{N-m}{n-0}}{\binom{N}{n}}\cdot 0^{2}+\sum_{x=1}^{n}\frac{\binom{m}{x}\binom{N-m}{n-x}}{\binom{N}{n}}\cdot x^{2}$$
Applying the same two binomial identities as in the derivation of the mean (one in the denominator, one to absorb a factor of $x$ into $\binom{m}{x}$) leaves a single factor of $x$:
$$\operatorname{E}[X^{2}]=0+\sum_{x=1}^{n}\frac{m\binom{m-1}{x-1}\binom{N-m}{n-x}}{\frac{N}{n}\binom{N-1}{n-1}}\cdot x$$
$$\operatorname{E}[X^{2}]=\frac{mn}{N}\sum_{x=1}^{n}\frac{\binom{m-1}{x-1}\binom{N-m}{n-x}}{\binom{N-1}{n-1}}\cdot x$$
We use the same variable substitution as when deriving the mean.
$$\operatorname{E}[X^{2}]=\frac{mn}{N}\sum_{x'=0}^{n'}\frac{\binom{m'}{x'}\binom{N'-m'}{n'-x'}}{\binom{N'}{n'}}(x'+1)$$
$$\operatorname{E}[X^{2}]=\frac{mn}{N}\left[\sum_{x'=0}^{n'}\frac{\binom{m'}{x'}\binom{N'-m'}{n'-x'}}{\binom{N'}{n'}}x'+\sum_{x'=0}^{n'}\frac{\binom{m'}{x'}\binom{N'-m'}{n'-x'}}{\binom{N'}{n'}}\right]$$
The first sum is the expected value of a hypergeometric random variable with parameters (n′, m′, N′). The second sum is the total sum of that random variable's pmf, which is 1.
$$\operatorname{E}[X^{2}]=\frac{mn}{N}\left[\frac{n'm'}{N'}+1\right]$$
$$\operatorname{E}[X^{2}]=\frac{mn}{N}\left[\frac{(n-1)(m-1)}{(N-1)}+1\right]=\frac{mn}{N}\left[\frac{(n-1)(m-1)+(N-1)}{(N-1)}\right]$$
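As a check on this intermediate result, the second moment can be computed directly from the pmf and compared with the bracketed expression. The sketch below assumes N > 1 so that the division by N − 1 is defined. The function names and example parameters are illustrative only.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x: int, n: int, m: int, N: int) -> Fraction:
    """Exact pmf value as a rational number."""
    return Fraction(comb(m, x) * comb(N - m, n - x), comb(N, n))


def second_moment_by_sum(n: int, m: int, N: int) -> Fraction:
    """E[X^2] computed directly from the definition, sum of x^2 * f(x)."""
    return sum(x * x * hypergeom_pmf(x, n, m, N) for x in range(n + 1))


def second_moment_closed_form(n: int, m: int, N: int) -> Fraction:
    """E[X^2] = (mn/N) * [ (n-1)(m-1)/(N-1) + 1 ], assuming N > 1."""
    return Fraction(m * n, N) * (Fraction((n - 1) * (m - 1), N - 1) + 1)


# Hypothetical example parameters.
n, m, N = 5, 7, 10
print(second_moment_by_sum(n, m, N), second_moment_closed_form(n, m, N))
```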
We then solve for the variance
$$\operatorname{Var}(X)=\operatorname{E}[X^{2}]-(\operatorname{E}[X])^{2}$$
$$\operatorname{Var}(X)=\frac{mn}{N}\left[\frac{(n-1)(m-1)+(N-1)}{(N-1)}\right]-\left(\frac{mn}{N}\right)^{2}$$
$$\operatorname{Var}(X)=\frac{Nmn}{N^{2}}\left[\frac{(n-1)(m-1)+(N-1)}{(N-1)}\right]-\frac{(N-1)(mn)^{2}}{(N-1)N^{2}}$$
Over the common denominator $N^{2}(N-1)$ the numerator is $mn\left[N^{2}-Nn-Nm+mn\right]=mn(N-n)(N-m)$, so
$$\operatorname{Var}(X)=\frac{nm(N-n)(N-m)}{N^{2}(N-1)}$$
or, equivalently,
$$\operatorname{Var}(X)=\frac{nm}{N}\left(1-\frac{n}{N}\right)\left(1-\frac{m-1}{N-1}\right)$$
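The two closed forms of the variance can be checked against each other and against the definition Var(X) = E[X²] − (E[X])². This is a minimal sketch using exact fractions. It assumes N > 1, and all function names and example parameters are hypothetical.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x: int, n: int, m: int, N: int) -> Fraction:
    """Exact pmf value as a rational number."""
    return Fraction(comb(m, x) * comb(N - m, n - x), comb(N, n))


def variance_by_sum(n: int, m: int, N: int) -> Fraction:
    """Var(X) = E[X^2] - (E[X])^2, with both moments summed from the pmf."""
    xs = range(n + 1)
    mean = sum(x * hypergeom_pmf(x, n, m, N) for x in xs)
    second = sum(x * x * hypergeom_pmf(x, n, m, N) for x in xs)
    return second - mean ** 2


def variance_form_1(n: int, m: int, N: int) -> Fraction:
    """n m (N - n)(N - m) / (N^2 (N - 1)), assuming N > 1."""
    return Fraction(n * m * (N - n) * (N - m), N * N * (N - 1))


def variance_form_2(n: int, m: int, N: int) -> Fraction:
    """(nm/N)(1 - n/N)(1 - (m - 1)/(N - 1)), assuming N > 1."""
    return Fraction(n * m, N) * (1 - Fraction(n, N)) * (1 - Fraction(m - 1, N - 1))


# Hypothetical example parameters.
n, m, N = 5, 7, 10
print(variance_by_sum(n, m, N), variance_form_1(n, m, N), variance_form_2(n, m, N))
```

The second form factors as the mean times two correction terms, since 1 − (m − 1)/(N − 1) = (N − m)/(N − 1).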