
KR-20 (Kuder-Richardson Formula 20) is an index of the internal consistency reliability of a measurement instrument, such as a test, questionnaire, or inventory. Although it can be applied to any test item responses that are dichotomously scored, it is most often used in classical psychometric analysis of psychoeducational tests and is therefore discussed here from that perspective.

Values of KR-20 generally range from 0.0 to 1.0, with higher values representing a more internally consistent instrument. In very rare cases, typically with very small samples, values less than 0.0 can occur, which indicate an extremely unreliable measurement. A rule of thumb commonly applied in practice is that 0.7 is an acceptable value, or 0.8 for longer tests of 50 items or more. Squaring KR-20 provides an estimate of the proportion of score variance not resulting from error. Measurements with KR-20 < 0.7 have the majority of score variance resulting from error, which is unacceptable in most situations.

Internal consistency reliability is defined as the consistency, repeatability, or homogeneity of measurement given a set of item responses. Several approaches to reliability exist, and the approach relevant to a specific application depends on the sources of error that are of interest, with internal consistency being appropriate for error resulting from differing items.

KR-20 is calculated as

\[
\text{KR-20} = \frac{K}{K-1}\left(1 - \frac{\sum_{i=1}^{K} p_i q_i}{\sigma_x^2}\right) \tag{1}
\]

where $K$ is the number of items or observations (indexed by $i$), $p_i$ is the proportion of responses in the keyed direction for item $i$, $q_i = 1 - p_i$, and $\sigma_x^2$ is the variance of the raw summed scores. Therefore, KR-20 is a function of the number of items, item difficulty, and the variance of examinee raw scores. It is also a function of the item-total correlations (classical discrimination statistics) and increases as the average item-total correlation increases.
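
As an illustration, the calculation can be sketched in a few lines of Python. This is a minimal sketch rather than a reference implementation: the function name `kr20` and the small response matrix are hypothetical, and the total-score variance is assumed to be the population variance (divided by N) so that it is on the same footing as the item variances p_i q_i.

```python
def kr20(responses):
    """KR-20 for a list of examinee rows, each a list of 0/1 item scores.

    Computes K/(K-1) * (1 - sum(p_i * q_i) / variance of total scores).
    Assumption: population variance (divide by N) for the total scores.
    """
    n = len(responses)        # number of examinees
    k = len(responses[0])     # number of items, K
    # p_i: proportion of responses in the keyed direction for item i
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    sum_pq = sum(p_i * (1 - p_i) for p_i in p)
    # Raw summed score for each examinee and its variance
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical data: 5 examinees, 4 dichotomous items
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 3))  # 0.8
```

Here the item proportions are 0.8, 0.6, 0.4, and 0.2, so the summed item variances equal 0.8, the total-score variance is 2.0, and the result is (4/3)(1 − 0.4) = 0.8.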

KR-20 produces results equivalent to coefficient α, which is another index of internal consistency, and can be considered a special case of α. KR-20 can be calculated only on dichotomous data, where each item in the measurement instrument is scored into only two categories. Examples of this include true/false, correct/incorrect, yes/no, and present/absent. Coefficient α also can be calculated on polytomous data, that is, data with more than two levels. A common example of polytomous data is a Likert-type rating scale.
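
The equivalence can be checked numerically. The sketch below is illustrative only: the function names and both data sets are hypothetical, and population variances (divided by N) are assumed throughout, because the variance of a 0/1 item computed that way is exactly p_i q_i.

```python
def _pvariance(values):
    """Population variance (divide by N)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def coefficient_alpha(responses):
    """Coefficient alpha: K/(K-1) * (1 - sum of item variances / total variance)."""
    k = len(responses[0])
    item_vars = [_pvariance([row[i] for row in responses]) for i in range(k)]
    total_var = _pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def kr20(responses):
    """KR-20: the same expression with each item variance written as p_i * q_i."""
    n, k = len(responses), len(responses[0])
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    total_var = _pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(p_i * (1 - p_i) for p_i in p) / total_var)


# Hypothetical dichotomous data: the two coefficients agree
dichotomous = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
assert abs(coefficient_alpha(dichotomous) - kr20(dichotomous)) < 1e-12

# Hypothetical Likert-type (polytomous) data: only alpha applies
likert = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2], [3, 3, 3]]
alpha_likert = coefficient_alpha(likert)
```

Calling `kr20` on the Likert-type data would not be meaningful, because p_i q_i is the variance of an item only when the item takes the values 0 and 1.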

Like α, KR-20 can be described as the mean of all possible split-half reliability coefficients based on the Flanagan-Rulon approach of split-half reliability. An additional interpretation is derived from Formula 1: The term $p_i q_i$ represents the variance of each item. If this is considered error variance, then the sum of the item variances divided by the total variance in scores represents the proportion of variance resulting from error. Subtracting this quantity from 1 translates it into the proportion of variance not resulting from error, assuming there is no source of error other than the random error present in the process of an examinee responding to each item.

G. Frederic Kuder and Marion Richardson also developed a simplification of KR-20 called KR-21, which assumes that the item difficulties are equivalent. KR-21 substitutes the mean of the $p_i$ and the mean of the $q_i$ into Formula 1 in place of the individual $p_i$ and $q_i$, which simplifies the calculation of the reliability.
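
A short sketch shows why this simplifies the calculation: once every item is assigned the mean difficulty, the sum of the item variances collapses to K times the mean p times the mean q, and the mean p is simply the mean total score divided by K. Only summary statistics are then needed, not item-level data. The function name `kr21` and the input values are hypothetical, and a population variance for the total scores is again assumed.

```python
def kr21(k, mean_total, var_total):
    """KR-21 from summary statistics only.

    k          : number of items, K
    mean_total : mean of the raw summed scores
    var_total  : variance of the raw summed scores (assumed population variance)
    """
    p_bar = mean_total / k   # mean item difficulty, the mean of the p_i
    q_bar = 1 - p_bar        # mean of the q_i
    # Sum of p_i * q_i is replaced by K * p_bar * q_bar
    return (k / (k - 1)) * (1 - k * p_bar * q_bar / var_total)


# Hypothetical 4-item test with mean total score 2.0 and variance 2.0
print(round(kr21(4, 2.0, 2.0), 3))  # 0.667
```

When item difficulties actually differ, K times the mean p times the mean q is larger than the sum of the individual $p_i q_i$, so KR-21 computed this way will not exceed KR-20 for the same data.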

...
