
Item-Test Correlation

The item-test correlation is the Pearson correlation coefficient calculated for pairs of scores in which one member of each pair is an item score and the other is the total test score. The greater the value of the coefficient, the stronger the correlation between the item and the total test. Test developers strive to select items that correlate highly with the total score to ensure that the test is internally consistent. Because the item-test correlation is often used to support the contention that the item is a “good” contributor to what the test measures, it has sometimes been called an index of item validity. That term applies only to a type of evidence called internal structure validity, which is synonymous with internal consistency reliability. Because the item-test correlation is clearly an index of internal consistency, it should be considered a measure of item functioning associated with that type of reliability. The item-test correlation is one of many item discrimination indices used in item analysis.
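As a minimal sketch of the definition above, the following Python snippet correlates each item with the total score. The score matrix is made-up illustrative data (rows are examinees, columns are items), and the function name `item_test_correlations` is a hypothetical helper, not part of any library.

```python
import numpy as np

def item_test_correlations(scores):
    """Pearson correlation of each item column with the total test score."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)  # total score includes every item
    return np.array([
        np.corrcoef(scores[:, j], total)[0, 1]
        for j in range(scores.shape[1])
    ])

# Hypothetical data: 6 examinees (rows) by 4 dichotomous items (columns).
scores = np.array([
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
])

r_it = item_test_correlations(scores)
for j, r in enumerate(r_it):
    print(f"item {j}: item-test r = {r:.3f}")
```

Items with higher coefficients agree more closely with the total score and would be favored when selecting items for internal consistency.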

Because item responses are typically scored as zero when incorrect and one when correct, the item variable is binary or dichotomous (having two values). The resulting correlation is properly called a point-biserial coefficient when a binary item is correlated with a total score that has more than two values (called polytomous or continuous). However, some items, especially essay items, performance assessments, or those written for affective scales, are not usually dichotomous, and thus some item-test correlations are regular Pearson coefficients between polytomous items and total scores. The magnitude of correlations found when using polytomous items is usually greater than that observed for dichotomous items. Reliability is related to the magnitude of the correlations and to the number of items in a test, and thus with polytomous items, fewer items are usually sufficient to produce a given level of reliability. Similarly, to the extent that the average of the item-test correlations for a set of items is increased, the number of items needed for a reliable test is reduced.
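The point-biserial coefficient is numerically the same as the Pearson coefficient computed on a 0/1 item. The sketch below illustrates this using the standard computational form, (mean total of those answering correctly minus mean total of those answering incorrectly), divided by the standard deviation of the totals, times the square root of p times q, where p is the proportion correct and q = 1 - p. The data are made up for illustration, and `point_biserial` is a hypothetical helper name.

```python
import numpy as np

def point_biserial(item, total):
    """Point-biserial correlation between a 0/1 item and a total score."""
    item = np.asarray(item, dtype=float)
    total = np.asarray(total, dtype=float)
    p = item.mean()                      # proportion answering correctly
    q = 1.0 - p                          # proportion answering incorrectly
    mean_correct = total[item == 1].mean()
    mean_incorrect = total[item == 0].mean()
    sd_total = total.std()               # population SD (ddof=0), as the formula requires
    return (mean_correct - mean_incorrect) / sd_total * np.sqrt(p * q)

# Hypothetical data: one dichotomous item and the matching total scores.
item = np.array([1, 1, 1, 1, 0, 0])
total = np.array([4, 3, 2, 2, 1, 0])

r_pb = point_biserial(item, total)
r_pearson = np.corrcoef(item, total)[0, 1]
print(f"point-biserial = {r_pb:.4f}, Pearson = {r_pearson:.4f}")
```

The two values agree up to floating-point error, which is why ordinary correlation routines can be used for dichotomous items.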

All correlations tend to be higher in groups that have a wide range of talent than in groups where there is a more restricted range. In that respect, the item-test correlation presents information about the group as well as about the item and the test. The range of talent in the group might be limited in some samples, for example, in a group of students who have all passed prerequisites for an advanced class. In groups where a restriction of range exists, the item-test correlations will provide a lower estimate of the relationship between the item and the test.
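The effect of range restriction can be illustrated with a small simulation. This is only a sketch under stated assumptions: responses are generated from a simple logistic model with made-up parameters (20 items, a common slope of 1.5, difficulties spread from -1 to 1), and the "restricted" group is defined, hypothetically, as examinees whose simulated ability exceeds 0.5.

```python
import numpy as np

rng = np.random.default_rng(0)
n_examinees, n_items = 5000, 20

# Simulated (not real) abilities and item difficulties.
ability = rng.normal(size=n_examinees)
difficulty = np.linspace(-1.0, 1.0, n_items)

# Probability of a correct response under an assumed logistic model.
logits = 1.5 * (ability[:, None] - difficulty[None, :])
prob_correct = 1.0 / (1.0 + np.exp(-logits))
scores = (rng.random((n_examinees, n_items)) < prob_correct).astype(float)

def mean_item_test_correlation(scores):
    """Average item-test correlation across all items in a score matrix."""
    total = scores.sum(axis=1)
    return float(np.mean([
        np.corrcoef(scores[:, j], total)[0, 1]
        for j in range(scores.shape[1])
    ]))

r_full = mean_item_test_correlation(scores)
# Restricted group: only higher-ability examinees, e.g. those who passed a prerequisite.
r_restricted = mean_item_test_correlation(scores[ability > 0.5])

print(f"full range:       mean item-test r = {r_full:.3f}")
print(f"restricted range: mean item-test r = {r_restricted:.3f}")
```

Under these assumptions the restricted group shows a lower average item-test correlation even though the items themselves are unchanged.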

Even when the range of talent in the group being tested is not restricted, the item-test correlation is a spurious measure of item quality. The spuriousness arises because the item is itself included in the total test score, so the correlation of the item with itself is added to the correlation between the item and the rest of the total test score. A preferred concept might be the item-rest correlation, which is the correlation between the item and the sum of the remaining item scores. Another term for the item-rest correlation is the corrected item-test correlation, the name given to this type of index in the SPSS Scale Reliability analysis (SPSS, an IBM company).
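The item-rest correction can be sketched by subtracting each item from the total before correlating. The data below are made up for illustration, and the function name `item_rest_correlations` is a hypothetical helper; this is not a reproduction of the SPSS procedure.

```python
import numpy as np

def item_rest_correlations(scores):
    """Correlate each item with the sum of the remaining items."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)
    result = []
    for j in range(scores.shape[1]):
        rest = total - scores[:, j]  # total score with item j removed
        result.append(np.corrcoef(scores[:, j], rest)[0, 1])
    return np.array(result)

# Hypothetical data: 6 examinees (rows) by 4 dichotomous items (columns).
scores = np.array([
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
], dtype=float)

total = scores.sum(axis=1)
item_test = np.array([
    np.corrcoef(scores[:, j], total)[0, 1] for j in range(scores.shape[1])
])
item_rest = item_rest_correlations(scores)

for j in range(scores.shape[1]):
    print(f"item {j}: item-test r = {item_test[j]:.3f}, item-rest r = {item_rest[j]:.3f}")
```

The uncorrected coefficient is never smaller than the item-rest value, and the gap is largest on short tests, where a single item makes up a large share of the total.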

...
