
A norm-referenced test is designed so that a score is interpreted by comparing it with the scores of other examinees, that is, by the examinee's position in the ordering of a well-defined group of interest. Two important definitions associated with norm-referenced interpretations are percentiles and percentile ranks. A percentile is a test score below which a specified percentage of the scores falls. A percentile rank is the percentage of people in the group who have a score lower than the score of interest.
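The two definitions can be illustrated with a minimal Python sketch. The function names and the sample scores are hypothetical, and the percentile rule used here (the lowest observed score whose percentile rank reaches the target) is only one of several conventions in use; published norms tables often rely on other rules, such as counting half of the tied scores.

```python
def percentile_rank(scores, score):
    """Percentage of scores in the group that are strictly lower than `score`."""
    below = sum(1 for s in scores if s < score)
    return 100.0 * below / len(scores)


def percentile(scores, pct):
    """Lowest observed score with at least `pct` percent of scores below it.

    Assumption: this lowest-observed-score rule is for illustration only.
    Returns None when no observed score qualifies.
    """
    for candidate in sorted(set(scores)):
        if percentile_rank(scores, candidate) >= pct:
            return candidate
    return None


# Hypothetical norm group of ten raw scores on a math test.
norm_group = [12, 15, 15, 18, 21, 21, 24, 27, 30, 33]

print(percentile_rank(norm_group, 21))  # 40.0: four of the ten scores are lower
print(percentile(norm_group, 40))       # 21: the score with 40% of scores below it
```

Under these assumptions, a raw score of 21 has a percentile rank of 40, and 21 is correspondingly the 40th percentile of this group.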

Since about 1905, norm-referenced interpretations have been a dominant approach to making test scores meaningful to both educators and the public, although criterion-referenced interpretations have been gaining in prominence over the past several decades. A special type of criterion-referenced interpretation, known as standards-based interpretation, has been gaining in popularity since No Child Left Behind was enacted into federal law in 2002.

Score Interpretation

By itself, a raw score on a test has no meaning. Knowing only that an examinee answered 21 questions correctly on a math test tells us little. Faced with such a score, one might ask any of a number of questions.

  • How many items were on the test?
  • What content within math was covered?
  • What related content was excluded?
  • What was the cognitive complexity of those items?
  • What item formats were used (multiple-choice, problem solving, proofs)?
  • How well did other examinees perform on this same test?
  • What do we know about other achievements of examinees with similar scores?

Answers to each of these questions will raise other questions. Over time, the meaning of test scores accrues as users become familiar with the characteristics of those scores and the relationships those scores have with variables of interest.

Test makers try to facilitate this development of meaning by creating score scales that support the intended primary inferences. One such approach is referred to as norm-referenced: the performance of an examinee is compared with the performance of other examinees in a meaningfully defined group. Such interpretations may be particularly useful when determining how to allocate limited resources, such as when there are more applicants for an educational program, school, university, or job than there are openings. For example, norm-referenced tests might be used as a significant piece of information in determining which students should be placed in a remedial or gifted program.

If resource allocation decisions were simple and there were 12 spots in a remedial program, one could simply admit the 12 students with the lowest scores. But most real-world resource allocation problems are more complex and somewhat elastic. Normative data make it easier to develop expectations over time, which sometimes means admitting more or fewer students into such a program. Policy-makers might therefore prefer that a program be made available to any student in the bottom 10%, rather than to a fixed number of students.
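The contrast between a fixed number of spots and a normative cutoff can be sketched in Python. Everything in this example is hypothetical: the student names, the scores, the norm group of 100 scores, and the reading of "bottom 10%" as a percentile rank below 10 in that norm group.

```python
def percentile_rank(norm_scores, score):
    """Percentage of norm-group scores strictly lower than `score`."""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


def admit_fixed_count(cohort, spots):
    """Admit the `spots` lowest-scoring students, whatever their scores are."""
    ranked = sorted(cohort, key=lambda name: cohort[name])
    return ranked[:spots]


def admit_below_rank(cohort, norm_scores, cutoff_rank):
    """Admit every student whose percentile rank is below `cutoff_rank`."""
    return [name for name, score in cohort.items()
            if percentile_rank(norm_scores, score) < cutoff_rank]


# Hypothetical norm group: one examinee at each score from 1 to 100.
norm_scores = list(range(1, 101))

# Two hypothetical yearly cohorts applying to the same remedial program.
year_a = {"Ana": 5, "Ben": 9, "Cy": 40, "Dee": 72}
year_b = {"Eli": 3, "Fay": 55, "Gus": 80, "Hal": 61}

print(admit_fixed_count(year_a, 2))              # ['Ana', 'Ben']
print(admit_fixed_count(year_b, 2))              # ['Eli', 'Fay']
print(admit_below_rank(year_a, norm_scores, 10)) # ['Ana', 'Ben']
print(admit_below_rank(year_b, norm_scores, 10)) # ['Eli']
```

In this sketch the fixed-count rule admits Fay in the second year even though her score is above the middle of the norm group, whereas the normative rule lets the number admitted vary from year to year.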

Normative expectations can also serve to facilitate group comparisons, for example, whether students in a school or district are, as a group, performing better than those in other schools or districts. Whether or not such differences are interpreted correctly, they can influence the perceived desirability of neighborhoods and, thus, real estate prices.

...
