
Negative Binomial Distribution

Just as the Bernoulli and the Binomial distributions are related in counting the number of successes in 1 or more trials, the Geometric and the Negative Binomial distributions are related in counting the number of trials needed to get 1 or more successes.

The Negative Binomial distribution gives the probability distribution of the number of trials needed to achieve a fixed number of desired results (successes). For example:

  • How many times will I throw a coin until it lands on heads for the 10th time?
  • How many children will I have when I get my third daughter?
  • How many cards will I have to draw from a pack until I get the second Joker?

Just like the Binomial distribution, the Negative Binomial distribution has two controlling parameters: the probability of success p in each independent trial and the desired number of successes m. If a random variable X (the total number of trials needed) has a Negative Binomial distribution with parameters p and m, its probability mass function is:

P(X = n) = \binom{n-1}{m-1}\, p^{m} (1-p)^{n-m}, \qquad n = m, m+1, m+2, \ldots
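
As a quick numerical illustration (not part of the original text), the pmf can be evaluated directly. The sketch below is one possible Python rendering; the function name neg_binomial_pmf and its argument order are our own choices, not a standard library API.

    from math import comb

    def neg_binomial_pmf(n, m, p):
        """Probability that the m-th success occurs on exactly the n-th trial."""
        if n < m:
            return 0.0
        return comb(n - 1, m - 1) * p**m * (1 - p)**(n - m)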

Example

A travelling salesman goes home if he has sold 3 encyclopedias that day. Some days he sells them quickly. Other days he's out till late in the evening. If on the average he sells an encyclopedia at one out of ten houses he approaches, what is the probability of returning home after having visited only 10 houses?

Answer:

The number of trials X is Negative Binomial distributed with parameters p=0.1 and m=3, hence:

P(X = 10) = \binom{10-1}{3-1}\, (0.1)^{3} (0.9)^{10-3} = \binom{9}{2}\, (0.1)^{3} (0.9)^{7} \approx 0.0172
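
For a quick check of this number, the formula can be evaluated directly, and, if SciPy happens to be available, compared against scipy.stats.nbinom (a minimal sketch under that assumption; note that SciPy's nbinom counts failures rather than trials, so 10 houses with 3 sales corresponds to 7 failures):

    from math import comb

    p, m, n = 0.1, 3, 10
    print(comb(n - 1, m - 1) * p**m * (1 - p)**(n - m))   # direct formula, ≈ 0.0172

    from scipy.stats import nbinom
    print(nbinom.pmf(n - m, m, p))                        # SciPy cross-check, ≈ 0.0172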

Mean

The mean can be derived as follows.

E[X] = \sum_{x=m}^{\infty} x \binom{x-1}{m-1} p^{m} (1-p)^{x-m}
     = \sum_{x=m}^{\infty} \frac{x!}{(m-1)!\,(x-m)!}\, p^{m} (1-p)^{x-m}
     = \frac{m}{p} \sum_{x=m}^{\infty} \frac{x!}{m!\,(x-m)!}\, p^{m+1} (1-p)^{x-m}
     = \frac{m}{p} \sum_{x=m}^{\infty} \binom{x}{m}\, p^{m+1} (1-p)^{x-m}

Now let s = m+1 and w = x+1 inside the summation.

E[X] = \frac{m}{p} \sum_{w=s}^{\infty} \binom{w-1}{s-1}\, p^{s} (1-p)^{w-s}

We see that the summation is the sum over the complete pmf of a negative binomial random variable distributed NB(s, p), which is 1 (as can be verified by applying Newton's generalized binomial theorem). Hence:

E[X] = \frac{m}{p}
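
As a sanity check on this result (illustrative only, using the salesman's values m = 3, p = 0.1 from above), the infinite sum can be truncated far into the tail and compared with m/p:

    from math import comb

    m, p = 3, 0.1
    # sum x * P(X = x) over a range long enough that the remaining tail is negligible
    mean = sum(x * comb(x - 1, m - 1) * p**m * (1 - p)**(x - m) for x in range(m, 2000))
    print(mean, m / p)   # both ≈ 30.0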

Variance

We derive the variance using the following formula:

\operatorname{Var}(X) = E[X^2] - (E[X])^2

We have already calculated E[X] above, so now we will calculate E[X^2] and then return to this variance formula:

E[X^2] = \sum_{x=m}^{\infty} x^{2} \binom{x-1}{m-1} p^{m} (1-p)^{x-m}
       = \sum_{x=m}^{\infty} x\, \frac{x!}{(m-1)!\,(x-m)!}\, p^{m} (1-p)^{x-m}
       = \frac{m}{p} \sum_{x=m}^{\infty} x\, \frac{x!}{m!\,(x-m)!}\, p^{m+1} (1-p)^{x-m}
       = \frac{m}{p} \sum_{x=m}^{\infty} x \binom{x}{m}\, p^{m+1} (1-p)^{x-m}

Again, let s = m+1 and w = x+1.

E[X^2] = \frac{m}{p} \sum_{w=s}^{\infty} (w-1) \binom{w-1}{s-1}\, p^{s} (1-p)^{w-s}
       = \frac{m}{p} \left( \sum_{w=s}^{\infty} w \binom{w-1}{s-1}\, p^{s} (1-p)^{w-s} - \sum_{w=s}^{\infty} \binom{w-1}{s-1}\, p^{s} (1-p)^{w-s} \right)

The first summation is the mean of a negative binomial random variable distributed NB(s, p), which is s/p, and the second summation is the complete sum of that variable's pmf, which is 1. Therefore:

E[X^2] = \frac{m}{p} \left( \frac{s}{p} - 1 \right)
       = \frac{m}{p} \cdot \frac{m+1-p}{p}
       = \frac{m\,(m+1-p)}{p^{2}}

We now insert these values into the original variance formula:

\operatorname{Var}(X) = E[X^2] - (E[X])^2 = \frac{m\,(m+1-p)}{p^{2}} - \frac{m^{2}}{p^{2}} = \frac{m\,(1-p)}{p^{2}}
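
As with the mean, this result can be checked numerically for illustrative values (again m = 3, p = 0.1), truncating the sums once the tail is negligible:

    from math import comb

    m, p = 3, 0.1
    xs = range(m, 3000)
    pmf = [comb(x - 1, m - 1) * p**m * (1 - p)**(x - m) for x in xs]
    mean = sum(x * q for x, q in zip(xs, pmf))
    second_moment = sum(x * x * q for x, q in zip(xs, pmf))
    print(second_moment - mean**2, m * (1 - p) / p**2)   # both ≈ 270.0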