Item response theory (IRT) is an approach used for survey development, evaluation, and scoring. IRT models describe the relationship between a person's response to a survey question and his or her standing on a latent (i.e., unobservable) construct (e.g., math ability, depression severity, or fatigue level) being measured by multiple survey items. IRT modeling is used to (a) evaluate the psychometric properties of a survey, (b) test for measurement equivalence in responses to surveys administered across diverse populations, (c) link two or more surveys measuring similar domains on a common metric, and (d) develop tailored questionnaires that estimate a person's standing on a construct with the fewest possible questions. This entry discusses IRT model basics, the application of IRT to survey research, and obstacles to the widespread application of IRT.

IRT Model Basics

Item Response Curves

For each item in a scale, IRT models describe how the item performs in measuring different levels of the construct. For example, the item I don't seem to care what happens to me would have IRT properties reflecting that it is informative for measuring people with severe levels of depression, and an item such as I am happy most of the time would have IRT properties reflecting that it is informative for measuring people with low levels of depression.

The probabilistic relationship between a person's response to an item and the latent variable (θ) is expressed by item response curves (also referred to as category response curves or item trace lines). For example, Figure 1 presents the IRT response curves for the item I am unhappy some of the time, which has two responses, “false” and “true,” and is part of a scale measuring depression.

Individuals with little depression are located on the left side of the θ continuum in Figure 1, and people with severe depression are located on the right side of the axis. The vertical axis in Figure 1 indicates the probability that a person will select one of the item's response categories. Thus, the two response curves in Figure 1 indicate that the probability of responding “false” or “true” to the item I am unhappy some of the time depends on the respondent's depression level.

Figure 1 Item response curves representing the probability of a “false” or “true” response to the item I am unhappy some of the time conditional on a person's depression level. The threshold (b = 0.25) indicates the level of depression (θ) needed for a person to have a 50% probability of responding “false” or “true.”

Note: Numbers on the θ-axis are expressed in standardized units and, for the illustrations in this discussion, the mean depression level of the study population is set to 0 and the standard deviation is set to 1. Thus, a depression score of θ = 2.0 indicates that a person is 2 standard deviations above the population mean and is highly depressed.

The response curves in Figure 1 are represented by logistic curves that model the probability P that a person will respond “true” to this item (i) as a function of a respondent's depression level θ, the relationship (a) of the item to the measured construct, and the severity or threshold (b) of the item on the θ scale. In IRT, a and b are referred to as item discrimination and threshold parameters, respectively.
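
A minimal sketch of such a logistic curve may help make the roles of θ, a, and b concrete. It assumes the standard two-parameter logistic form, P(true | θ) = 1 / (1 + exp(-a(θ - b))), which matches the description above but is an assumption here rather than a formula taken from this entry. The threshold b = 0.25 comes from Figure 1; the discrimination value a = 2.0 and the function names are hypothetical choices for illustration only.

```python
import math

B_THRESHOLD = 0.25       # threshold reported for Figure 1
A_DISCRIMINATION = 2.0   # hypothetical value; the entry does not report a


def prob_true(theta: float, a: float, b: float) -> float:
    """Probability of a "true" response under an assumed two-parameter logistic curve."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def response_curves(thetas, a=A_DISCRIMINATION, b=B_THRESHOLD):
    """Return (theta, P(false), P(true)) tuples for a two-category item."""
    rows = []
    for theta in thetas:
        p_true = prob_true(theta, a, b)
        # With only two response categories, the "false" curve is the complement.
        rows.append((theta, 1.0 - p_true, p_true))
    return rows


if __name__ == "__main__":
    for theta, p_false, p_true in response_curves([-2.0, -1.0, 0.0, 0.25, 1.0, 2.0]):
        print(f"theta={theta:+.2f}  P(false)={p_false:.3f}  P(true)={p_true:.3f}")
```

Under this assumed form, the two curves cross where θ equals b, so a respondent at the threshold has a 50% probability of either response. A larger a would make the curves steeper around b, meaning the item separates respondents just below and just above the threshold more sharply.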

...
