# R Programming/Maximum Likelihood

## IntroductionEdit

Maximum likelihood estimation is just an optimization problem. You have to write down your log likelihood function and use some optimization technique. Sometimes you also need to write your score (the first derivative of the log likelihood) and or the hessian (the second derivative of the log likelihood).

## One dimensionEdit

If there is only one parameter, we can optimize the log likelihood using `optimize()`.

### Example with a type 1 Pareto distributionEdit

We provide an example with a type 1 Pareto distribution. Note that in this example we treat the minimum as known and do not estimate it. Therefore this is a one-dimensional problem.

We use the `rpareto1()` (actuar) function to generate a random vector from a type 1 Pareto distribution with shape equal to 1 and minimum value equal to 500. We use the `dpareto1()` (actuar) function with option `log = TRUE` to write the log likelihood. Then we just need to use `optimize()` with `maximum=TRUE`. We provide a minimum and a maximum value for the parameter with the `interval` option.

```> library(actuar)
> y <- rpareto1(1000, shape = 1, min = 500)
> ll <- function(mu, x) {
+    sum(dpareto1(x,mu[1],min = min(x),log = TRUE))
+   }
> optimize(f = ll, x = y, interval = c(0,10), maximum = TRUE)
```

## Multiple dimensionEdit

• fitdistr() (MASS package) fits univariate distributions by maximum likelihood. It is a wrapper for `optim()`.
• If you need to program yourself your maximum likelihood estimator (MLE) you have to use a built-in optimizer such as nlm(), optim(). R also includes the following optimizers :
• mle() in the stats4 package
• The maxLik package

### Example with a logistic distributionEdit

For instance, we draw from a logistic distribution and we estimate the parameters using .

```> # draw from a gumbel distribution using the inverse cdf simulation method
> e.1 <- -log(-log(runif(10000,0,1)))
> e.2 <- -log(-log(runif(10000,0,1)))
> u <- e.2 - e.1  # u follows a logistic distribution (difference between two gumbels.)
> fitdistr(u,densfun=dlogis,start=list(location=0,scale=1))
```

### Example with a Cauchy distributionEdit

For instance, we can write a simple maximum likelihood estimator for a Cauchy distribution using the nlm() optimizer. We first draw a vector x from a Cauchy distribution. Then we define the log likelihood function and then we optimize using the nlm() function. Note that nlm() is minimizer and not a maximizer.

```> n <- 100
> x <- rcauchy(n)
> mlog.1 <- function(mu, x) {
+   - sum(dcauchy(x, location = mu, log = TRUE))
+   }
> mu.start <- median(x)
> out <- nlm(mlog.1, mu.start, x = x)
```

### Example with a beta distributionEdit

Here is an other example with the Beta distribution and the optim() function.

```> y <- rbeta(1000,2,2)
> loglik <- function(mu, x) {
+    sum(-dbeta(x,mu[1],mu[2],log = TRUE))
+    }
>
> out <- optim(par = c(1,1), fn=loglik,x=y,method = "L-BFGS-B",lower=c(0,0))
```

## TestsEdit

### Likelihood Ratio TestEdit

• lrtest() in the lmtest package[1].

## Some Specific casesEdit

• gum.fit() (ismev package) provides MLE for a Gumbel distributon

## ReferencesEdit

1. Achim Zeileis, Torsten Hothorn (2002). Diagnostic Checking in Regression Relationships. R News 2(3), 7-10. URL http://CRAN.R-project.org/doc/Rnews/