Dummy coding, also known as indicator coding, provides a means for researchers to represent a categorical variable as a set of independent quantitative variables. The resulting dummy variables take on values of 0 and 1 and can be used as predictors in regression analysis. Given a categorical variable that can take on k values, it is possible to create k − 1 dummy variables without any loss of information. Dummy variables are often included in regression models to estimate the effects of categorical variables such as race, marital status, diagnostic group, and treatment setting.
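
The mapping from k levels to k − 1 indicator variables can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function name `dummy_code`, its `reference` parameter, and the marital status data are hypothetical and chosen only for this example. The level passed as `reference` receives no column of its own and is identified by zeros on every dummy variable.

```python
def dummy_code(values, reference=None):
    """Dummy-code a categorical variable into k - 1 indicator columns.

    Hypothetical helper for illustration. Returns the levels that received
    a column and, for each observation, a list of 0/1 indicators.
    """
    levels = sorted(set(values))
    if reference is None:
        # Assumption: default to the first level in sorted order.
        reference = levels[0]
    if reference not in levels:
        raise ValueError(f"reference level {reference!r} not found in data")

    # Every level except the reference gets its own dummy variable.
    dummy_levels = [level for level in levels if level != reference]

    # An observation scores 1 on the column matching its level, 0 elsewhere.
    rows = [[1 if value == level else 0 for level in dummy_levels]
            for value in values]
    return dummy_levels, rows


# Made-up data: three levels (k = 3) yield two dummy variables (k - 1 = 2).
marital_status = ["single", "married", "divorced", "married"]
columns, rows = dummy_code(marital_status, reference="single")

print(columns)
for status, row in zip(marital_status, rows):
    print(status, row)
```

Because an observation in the reference category is coded 0 on both columns, the two dummy variables together still distinguish all three levels, which is why no information is lost.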

When constructing a set of dummy variables, one level of the original categorical variable is selected as the reference category and receives no dummy variable of its own. Each remaining level becomes a single dummy ...
