Last modified on 25 March 2014, at 16:01

R Programming/Maximum Likelihood

IntroductionEdit

Maximum likelihood estimation is just an optimization problem. You have to write down your log likelihood function and use some optimization technique. Sometimes you also need to write your score (the first derivative of the log likelihood) and or the hessian (the second derivative of the log likelihood).

One dimensionEdit

If there is only one parameter, we can optimize the log likelihood using optimize().

Example with a type 1 Pareto distributionEdit

We provide an example with a type 1 Pareto distribution. Note that in this example we treat the minimum as known and do not estimate it. Therefore this is a one-dimensional problem.

We use the rpareto1() (actuar) function to generate a random vector from a type 1 Pareto distribution with shape equal to 1 and minimum value equal to 500. We use the dpareto1() (actuar) function with option log = TRUE to write the log likelihood. Then we just need to use optimize() with maximum=TRUE. We provide a minimum and a maximum value for the parameter with the interval option.

> library(actuar)
> y <- rpareto1(1000, shape = 1, min = 500)
> ll <- function(mu, x) { 
+    sum(dpareto1(x,mu[1],min = min(x),log = TRUE)) 
+   } 
> optimize(f = ll, x = y, interval = c(0,10), maximum = TRUE)


Multiple dimensionEdit

  • fitdistr() (MASS package) fits univariate distributions by maximum likelihood. It is a wrapper for optim().
  • If you need to program yourself your maximum likelihood estimator (MLE) you have to use a built-in optimizer such as nlm(), optim(). R also includes the following optimizers :
  • mle() in the stats4 package
  • The maxLik package


Example with a logistic distributionEdit

For instance, we draw from a logistic distribution and we estimate the parameters using .

> # draw from a gumbel distribution using the inverse cdf simulation method
> e.1 <- -log(-log(runif(10000,0,1))) 
> e.2 <- -log(-log(runif(10000,0,1)))
> u <- e.2 - e.1  # u follows a logistic distribution (difference between two gumbels.)
> fitdistr(u,densfun=dlogis,start=list(location=0,scale=1))

Example with a Cauchy distributionEdit

For instance, we can write a simple maximum likelihood estimator for a Cauchy distribution using the nlm() optimizer. We first draw a vector x from a Cauchy distribution. Then we define the log likelihood function and then we optimize using the nlm() function. Note that nlm() is minimizer and not a maximizer.

> n <- 100
> x <- rcauchy(n)
> mlog.1 <- function(mu, x) { 
+   - sum(dcauchy(x, location = mu, log = TRUE)) 
+   } 
> mu.start <- median(x)
> out <- nlm(mlog.1, mu.start, x = x)


Example with a beta distributionEdit

Here is an other example with the Beta distribution and the optim() function.

> y <- rbeta(1000,2,2)
> loglik <- function(mu, x) { 
+    sum(-dbeta(x,mu[1],mu[2],log = TRUE)) 
+    } 
> 
> out <- optim(par = c(1,1), fn=loglik,x=y,method = "L-BFGS-B",lower=c(0,0))

TestsEdit

Likelihood Ratio TestEdit

  • lrtest() in the lmtest package[1].


Some Specific casesEdit

  • gum.fit() (ismev package) provides MLE for a Gumbel distributon


ResourcesEdit

ReferencesEdit

  1. Achim Zeileis, Torsten Hothorn (2002). Diagnostic Checking in Regression Relationships. R News 2(3), 7-10. URL http://CRAN.R-project.org/doc/Rnews/
Previous: Linear Models Index Next: Bayesian Methods