
In statistics, discrimination is the ability of a prediction (judgment scheme, statistical model, etc.) to distinguish between events and nonevents (or cases from controls, successes from failures, disease from nondisease, etc.). In the simplest form, a prediction scheme focuses on a single event with two possible states and assigns some estimate of the chance that one state will occur. This prediction comes from the set of cues and other factors, both measurable and immeasurable, available to the researcher.

Whether it is meteorologists forecasting the weather, business analysts predicting the rise and fall of the stock market, bookmakers setting odds on the big game, or physicians diagnosing disease, predictions have some degree of “correctness” relative to the actual occurrence of some unknown or future event. In medicine, we commonly predict the presence or absence of disease (diagnosis) or the likelihood of disease development or progression (prognosis). Measures have arisen to gauge the quality of a given set of predictions and to quantify prediction accuracy.

Multiple methods for forming these predictions exist, and each has strengths and weaknesses. One aspect of prediction accuracy is the “difficulty” set by nature; outcome index variance is a measure of this difficulty. Calibration, in turn, addresses the relationship of the subgroup-specific predictions to the subgroup-specific observed event rates. The part of prediction accuracy that is often of highest interest is discrimination. The task of discrimination is to determine, with some degree of certainty, whether the event will or will not occur: it measures the degree to which the prediction scheme separates events from nonevents. Discrimination is therefore influenced by variation in the predictions within the event and nonevent groups. Discrimination strength is related to the degree to which a prediction scheme assigns events and nonevents different probabilities, that is, how well a scheme separates outcomes into distinct “bins” (e.g., alive vs. dead or first vs. second vs. third). The sole focus of discrimination is this ability to place different outcomes into different categories; the labels placed on those categories are somewhat arbitrary.
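One common way to quantify this separation is the concordance (c) statistic: the probability that a randomly chosen event receives a higher predicted probability than a randomly chosen nonevent. The sketch below, using hypothetical predictions and outcomes, illustrates the idea; the function name and data are assumptions for illustration, not part of the entry.

```python
def c_statistic(predictions, outcomes):
    """Concordance statistic for binary outcomes (1 = event, 0 = nonevent).

    Returns the proportion of event/nonevent pairs in which the event
    received the higher prediction, counting ties as half a concordance.
    """
    event_preds = [p for p, y in zip(predictions, outcomes) if y == 1]
    nonevent_preds = [p for p, y in zip(predictions, outcomes) if y == 0]
    score = sum(1.0 if pe > pn else 0.5 if pe == pn else 0.0
                for pe in event_preds for pn in nonevent_preds)
    return score / (len(event_preds) * len(nonevent_preds))

# A scheme whose predictions cleanly separate events from nonevents:
preds = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
actual = [1, 1, 1, 0, 0, 0]
print(c_statistic(preds, actual))  # 1.0: every event outranks every nonevent
```

A value of 1.0 reflects perfect separation of the two groups; a value of 0.5 reflects no separation at all, as discussed next.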

“Perfect” discrimination occurs when each predicted category contains either 100% or 0% of events. Perfect nondiscrimination, or nil discrimination, occurs when each group-specific event rate equals the overall percentage of events (also called the prevalence or base rate). In this case, the prediction scheme is no better than chance, and the groups are essentially assigned at random. One can, in fact, do worse than chance by predicting groups in the wrong direction. Paradoxically, such a scheme still carries more information than nil discrimination; it is simply labeling the groups incorrectly. Reversing the event and nonevent labels associated with the predictions then yields a scheme that discriminates better than chance.
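These three situations can be made concrete with the concordance statistic (the probability that a random event outranks a random nonevent). In this hypothetical sketch, constant predictions give chance-level discrimination, wrong-direction predictions score below chance, and flipping those predictions recovers strong discrimination; all data here are invented for illustration.

```python
def c_statistic(predictions, outcomes):
    """Concordance statistic for binary outcomes (1 = event, 0 = nonevent)."""
    events = [p for p, y in zip(predictions, outcomes) if y == 1]
    nonevents = [p for p, y in zip(predictions, outcomes) if y == 0]
    score = sum(1.0 if pe > pn else 0.5 if pe == pn else 0.0
                for pe in events for pn in nonevents)
    return score / (len(events) * len(nonevents))

actual = [1, 1, 1, 0, 0, 0]

# Nil discrimination: every subject gets the same prediction,
# so every group's event rate equals the overall prevalence.
print(c_statistic([0.5] * 6, actual))      # 0.5: no better than chance

# Predictions pointing in the wrong direction:
wrong = [0.2, 0.1, 0.3, 0.8, 0.9, 0.7]
print(c_statistic(wrong, actual))          # 0.0: worse than chance

# Reversing the labels (equivalently, using 1 - p) recovers the information:
print(c_statistic([1 - p for p in wrong], actual))  # 1.0
```

The worse-than-chance scheme is thus more useful than the nil scheme: its errors are systematic, so a simple relabeling converts them into correct classifications.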

Discrimination Types

Discrimination can be thought of in three distinct ways, each useful in different situations. These three types arise by analogy with types of data. Nominal data have no order but are simply labels (e.g., color or gender). Ordinal data have an associated order (e.g., mild/moderate/severe) but no measurable distance between groups. Continuous data have a measurable distance between any two values. Discrimination can be classified along the same lines: nominal, ordinal, and continuous.

...
