Poisson and Negative Binomial Regression

The response variable in medical data is often in the form of counts. Examples include visits to the doctor, cases of stroke or heart attacks, and number of deaths due to various causes. Two common distributions used to model counts that arise in situations such as these are the Poisson and negative binomial distributions. In this entry, the important aspects of Poisson and negative binomial regression are covered along with an example to illustrate basic inference for these models.

Perhaps the most common realization of Poisson data is “rare-event” data: events that occur relatively few times in a large population. In this case, the Poisson distribution is seen as a limiting form of the binomial distribution when the sample size, n, grows large and the probability of an event occurring, π, grows small, with the expected count nπ held fixed. As a rule of thumb, this approximation holds reasonably well for n > 20 and π < .1. In the medical and epidemiological literature, an example of this would be the number of cancer deaths in an at-risk group.
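The limiting argument can be checked numerically. The following is a minimal sketch, assuming the standard Poisson probability mass function with rate λ = nπ; the helper names (`binomial_pmf`, `poisson_pmf`, `max_abs_diff`) and the chosen values of n and π are illustrative and do not come from the entry.

```python
import math

def binomial_pmf(y, n, p):
    """P(Y = y) for a binomial count with n trials and event probability p."""
    return math.comb(n, y) * p**y * (1 - p) ** (n - y)

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson count with rate lam."""
    return lam**y * math.exp(-lam) / math.factorial(y)

def max_abs_diff(n, p, y_max=20):
    """Largest pointwise gap between the two pmfs over y = 0, ..., min(n, y_max)."""
    lam = n * p  # match the means of the two distributions
    return max(
        abs(binomial_pmf(y, n, p) - poisson_pmf(y, lam))
        for y in range(min(n, y_max) + 1)
    )

# Illustrative settings: n grows and p shrinks while n * p stays at 2.
for n, p in [(20, 0.1), (200, 0.01), (2000, 0.001)]:
    print(f"n={n:5d}  p={p:<6}  max |binomial - Poisson| = {max_abs_diff(n, p):.6f}")
```

The printed gap should shrink as n increases and π decreases, which is the sense in which the Poisson distribution is a limiting form of the binomial.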

A second and obviously related realization of Poisson data is that of discrete count events in time or space. Events are typically considered as “arrivals” or discretized points over a continuous domain; for example, it may be of interest to count the number of doctor visits for an individual or family over a period of time. In the spatial case, one may count the number of white blood cells per unit volume of a blood culture.

The probability that a Poisson random variable Y takes the observed value y is expressed in the probability mass function

$$P(Y = y) = \frac{\lambda^{y} e^{-\lambda}}{y!}, \qquad y = 0, 1, 2, \ldots$$

In this parameterization, λ is described as an intensity or rate parameter and is often interpreted as the expected number of events in the rare-events paradigm or as the rate of events per unit time/space in the spatiotemporal paradigm. The value e is Euler's number, the base of the natural logarithm, and the denominator y! is the factorial of the integer y, where y! = (y)(y − 1) … (2)(1). Thus, for gamma-distributed Poisson with

None
Typically, it is of interest to make some inference about the value of the unknown parameter λ. When subjects are followed over varying periods of time, the distribution is often parameterized with mean μ = dλ, where d, the amount of follow-up time, is often referred to as the offset.
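As a small illustration of the μ = dλ parameterization, the sketch below estimates a single common rate from subjects with different follow-up times. It assumes every subject shares the same λ, in which case the maximum likelihood estimate is total events divided by total follow-up time. The data values and variable names are hypothetical, chosen only for illustration.

```python
import math

def poisson_pmf(y, mu):
    """P(Y = y) for a Poisson count with mean mu."""
    return mu**y * math.exp(-mu) / math.factorial(y)

# Hypothetical data: event counts and years of follow-up for four subjects.
events = [0, 2, 1, 4]
follow_up_years = [1.0, 3.5, 2.0, 5.5]

# Under a common rate, maximizing the Poisson likelihood with means d_i * lambda
# gives lambda_hat = (total events) / (total follow-up time).
rate_hat = sum(events) / sum(follow_up_years)

# Expected count for each subject: mu_i = d_i * lambda_hat.
expected = [d * rate_hat for d in follow_up_years]

for d, y, mu in zip(follow_up_years, events, expected):
    print(f"follow-up={d:4.1f}  observed={y}  expected={mu:.3f}  "
          f"P(Y=0)={poisson_pmf(0, mu):.3f}")
```

Subjects with longer follow-up have larger expected counts even though the underlying rate per year is the same, which is why follow-up time has to enter the model.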

Poisson Regression Model

Poisson regression is one example of a broader class of models known as generalized linear models. This class includes ordinary least squares regression with normal errors, logistic regression, beta regression, and others. For Poisson regression, it is assumed that the value of the mean depends on a function of an observed vector of covariates, xi = (xi1, xi2, …, xip), and model parameters, β′ = (β0, β1, …, βp). Since the Poisson rate, λi, is strictly nonnegative, the expected number of events is usually modeled as

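$$\lambda_i = \exp(\mathbf{x}_i^{\prime}\boldsymbol{\beta}) = \exp(\beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip})$$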

If there is an offset, the mean is modeled as

$$\mu_i = d_i \lambda_i = d_i \exp(\mathbf{x}_i^{\prime}\boldsymbol{\beta})$$

In the generalized linear model terminology, the exponential function connecting the expected value and the covariates is referred to as the log link because taking the log of the mean function yields a linear combination of the covariates weighted by the regression parameters.
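To make the log link and the offset concrete, the following sketch fits a one-covariate Poisson regression, log μi = log di + β0 + β1xi, by Newton-Raphson on the log-likelihood. This is an illustrative implementation written for this purpose, not the fitting routine of any particular statistical package; the function name `fit_poisson_regression` and the data are hypothetical.

```python
import math

def fit_poisson_regression(x, y, d, n_iter=25):
    """Newton-Raphson fit of log(mu_i) = log(d_i) + b0 + b1 * x_i."""
    # Start at the pooled rate with no covariate effect.
    b0 = math.log(sum(y) / sum(d))
    b1 = 0.0
    for _ in range(n_iter):
        mu = [di * math.exp(b0 + b1 * xi) for xi, di in zip(x, d)]
        # Score vector: gradient of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a symmetric 2 x 2 matrix.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + information^{-1} * score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: x is a binary exposure indicator, y the event count,
# and d the follow-up time (the offset enters as log d on the log scale).
x = [0, 0, 0, 1, 1, 1]
y = [2, 3, 1, 6, 4, 8]
d = [10.0, 12.0, 8.0, 10.0, 9.0, 11.0]

b0, b1 = fit_poisson_regression(x, y, d)
print(f"baseline rate exp(b0) = {math.exp(b0):.4f}")
print(f"rate ratio   exp(b1) = {math.exp(b1):.4f}")
```

With a single binary covariate the fit can be checked by hand: exp(b0) should equal the events per unit of follow-up in the unexposed group, and exp(b1) should equal the ratio of the exposed rate to the unexposed rate. In practice a generalized linear model routine from a statistical package would be used instead of hand-coded iterations.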

...
