R Programming/Maximum Likelihood

R Basics
Data Management
Graphics
- Grammar of graphics
Publication quality output
Descriptive Statistics
Mathematics
Statistical Core Methods
Regression Models
Time Series
Factor Analysis
Classification
- Ordination
- Clustering
Network Analysis
High Performance Computing
- Profiling R code
- Parallel computing with R
Appendix
- Sources
- Index

Introduction

Maximum likelihood estimation is just an optimization problem. You have to write down your log likelihood function and use some optimization technique. Sometimes you also need to write your score (the first derivative of the log likelihood) and or the hessian (the second derivative of the log likelihood).

One dimension

If there is only one parameter, we can optimize the log likelihood using optimize().

Example with a type 1 Pareto distribution

We provide an example with a type 1 Pareto distribution. Note that in this example we treat the minimum as known and do not estimate it. Therefore this is a one-dimensional problem.

We use the rpareto1() (actuar) function to generate a random vector from a type 1 Pareto distribution with shape equal to 1 and minimum value equal to 500. We use the dpareto1() (actuar) function with option log = TRUE to write the log likelihood. Then we just need to use optimize() with maximum=TRUE. We provide a minimum and a maximum value for the parameter with the interval option.

> library(actuar)
> y <- rpareto1(1000, shape = 1, min = 500)
> ll <- function(mu, x) { 
+    sum(dpareto1(x,mu[1],min = min(x),log = TRUE)) 
+   } 
> optimize(f = ll, x = y, interval = c(0,10), maximum = TRUE)

Multiple dimension

fitdistr() (MASS package) fits univariate distributions by maximum likelihood. It is a wrapper for optim().
If you need to program yourself your maximum likelihood estimator (MLE) you have to use a built-in optimizer such as nlm(), optim(). R also includes the following optimizers :
mle() in the stats4 package
The maxLik package

Example with a logistic distribution

For instance, we draw from a logistic distribution and we estimate the parameters using .

> # draw from a gumbel distribution using the inverse cdf simulation method
> e.1 <- -log(-log(runif(10000,0,1))) 
> e.2 <- -log(-log(runif(10000,0,1)))
> u <- e.2 - e.1  # u follows a logistic distribution (difference between two gumbels.)
> fitdistr(u,densfun=dlogis,start=list(location=0,scale=1))

Example with a Cauchy distribution

For instance, we can write a simple maximum likelihood estimator for a Cauchy distribution using the nlm() optimizer. We first draw a vector x from a Cauchy distribution. Then we define the log likelihood function and then we optimize using the nlm() function. Note that nlm() is minimizer and not a maximizer.

> n <- 100
> x <- rcauchy(n)
> mlog.1 <- function(mu, x) { 
+   - sum(dcauchy(x, location = mu, log = TRUE)) 
+   } 
> mu.start <- median(x)
> out <- nlm(mlog.1, mu.start, x = x)

Example with a beta distribution

Here is an other example with the Beta distribution and the optim() function.

> y <- rbeta(1000,2,2)
> loglik <- function(mu, x) { 
+    sum(-dbeta(x,mu[1],mu[2],log = TRUE)) 
+    } 
> 
> out <- optim(par = c(1,1), fn=loglik,x=y,method = "L-BFGS-B",lower=c(0,0))