
Maximum Likelihood Estimation Methods

In medical decision making, statistical modeling plays a prominent role. Likelihood theory provides a generally applicable method for estimating and testing the parameters of a statistical model. The method goes back to one of the most famous statisticians in history, Sir Ronald A. Fisher, who developed it between 1912 and 1922, and most statistical methods in common use rely on it. This entry begins by explaining the likelihood function and then maximum likelihood estimation. Next, it discusses the properties of maximum likelihood estimation. The entry closes with a discussion of the application of likelihood methods for testing a null hypothesis.

The Likelihood Function

Consider a random sample of size n from a population in which the value of an outcome variable Y is observed on each individual in the sample. Suppose a statistical model is available that specifies the distribution of Y up to an unknown parameter θ, which may be a single parameter or a vector of parameters. If the outcome variable Y is discrete, its distribution is specified by the probability function, which gives the probability of each possible outcome value y given the parameter(s) θ. If Y is continuous, its distribution is described by the probability density function, a function such that the probability of Y taking a value between a and b corresponds to the area under its graph between a and b. The probability function or probability density function of Y is denoted by f(y|θ). It may depend on other observed variables X ("covariates") such as sex, age, and so on, but this dependence is suppressed in the notation. The observations are denoted by y1, y2, …, yn. The probability (density) function of one observation, say from individual i, is f(yi|θ). The joint probability of all observations in the sample is the product of f(yi|θ) over all individuals. Given the observations, this product is a function of the unknown parameter(s) and is called the likelihood function, L(θ):

$$L(\theta) = \prod_{i=1}^{n} f(y_i \mid \theta).$$
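As a small illustration of this definition, the sketch below computes the likelihood of a sample of binary outcomes as the product of per-observation probabilities. It is a minimal Python example; the function names and the toy data are illustrative, not part of this entry.

```python
import numpy as np

def bernoulli_pmf(y, theta):
    # f(y | theta): probability theta if y = 1, probability 1 - theta if y = 0
    return theta**y * (1.0 - theta)**(1 - y)

def likelihood(theta, y):
    # L(theta): product of f(y_i | theta) over all individuals in the sample
    return np.prod([bernoulli_pmf(yi, theta) for yi in y])

y = [1, 0, 0, 1, 0]              # toy sample of n = 5 observed outcomes
print(likelihood(0.4, y))        # 0.4**2 * 0.6**3 = 0.03456
```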

Example

Suppose one is interested in the unknown prevalence, θ, of type II diabetes in a certain population aged above 65 years. To estimate this prevalence, n individuals are randomly drawn (with replacement) from the population, and outcome Y is observed: Y = 1 if the individual has type II diabetes and Y = 0 if not. The probability of a random individual having the disease is θ, so that individual's contribution to the likelihood function is θ. The probability of a random individual not having the disease is 1 − θ, so the contribution of a healthy individual to the likelihood function is 1 − θ. Suppose m individuals with the disease are observed in the sample. Then the likelihood function is

$$L(\theta) = \theta^{m}(1-\theta)^{n-m}.$$

Thus, if the sample size is n = 300 and m = 21 individuals with type II diabetes are observed, the likelihood function is

$$L(\theta) = \theta^{21}(1-\theta)^{279}.$$
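To see where this function peaks, one can simply evaluate it on a fine grid of θ values. A minimal sketch in Python (the grid resolution is an arbitrary choice, not from the entry):

```python
import numpy as np

theta = np.linspace(0.001, 0.999, 9999)   # fine grid over the interval (0, 1)
L = theta**21 * (1.0 - theta)**279        # L(theta) for n = 300, m = 21

print(theta[np.argmax(L)])                # approximately 0.07 = 21/300
```

The grid search already suggests the answer derived analytically below: the likelihood is largest near θ = 0.07.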

Maximum Likelihood Estimation

According to likelihood theory, the best estimate, θ̂, is that value of θ for which the likelihood function takes its largest value; θ̂ is called the maximum likelihood estimate (MLE). Thus the MLE is the parameter value under which the observed data have maximal probability. To calculate it in practice, one usually maximizes the natural logarithm of the likelihood function, the log-likelihood: the logarithm turns the product over individuals into a sum, and because the logarithm is strictly increasing, log L(θ) and L(θ) attain their maximum at the same value of θ.
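For the diabetes example, the maximization can be carried out in closed form. A short worked derivation, setting the derivative of the log-likelihood to zero:

```latex
\ell(\theta) = \log L(\theta) = m \log \theta + (n - m) \log(1 - \theta),
\qquad
\ell'(\theta) = \frac{m}{\theta} - \frac{n - m}{1 - \theta} = 0
\quad\Longrightarrow\quad
\hat{\theta} = \frac{m}{n} = \frac{21}{300} = 0.07.
```

The MLE of the prevalence is thus simply the observed proportion of diseased individuals in the sample.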
