
Definition and Features

A criterion-referenced test (CRT) provides a measure of an individual's absolute performance or behavior on a well-defined domain. The domain may include a set of learning/behavioral objectives to be mastered or a set of standards to be achieved. The history of CRTs dates back to a 1963 essay by Robert Glaser in which he introduced criterion referencing as a new type of approach to test development and interpretation. Glaser indicated that the absolute comparisons that formed the basis of CRTs were preferable to the relative comparisons made by the norm-referenced tests (NRTs) widely used at that time.

One way to gain a better understanding of exactly what a CRT entails is to compare it to an NRT. In this section, CRTs are further defined and compared to NRTs. The next section discusses the development, use, and interpretation of CRTs in large-scale tests used for accountability purposes. It covers alignment and validity, performance levels, and standard setting. The third section discusses the use of CRTs for classroom instructional purposes. The final section briefly describes guidelines available to assist test users in selecting and interpreting CRTs.

What Does a CRT Measure?

Whereas NRTs compare an individual's performance to the performance of others in a comparison group, such as all students in a state or across the nation, CRTs compare an individual's performance to standards or learning objectives. Because of this distinction, interpretations of NRT scores are often referred to as relative comparisons and interpretations of CRT scores as absolute comparisons. With a CRT, test users want to know whether the student has achieved a standard or mastered a learning objective. They are not concerned about the position or ranking of the student relative to other students.
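The contrast between relative and absolute interpretation can be illustrated with a minimal Python sketch. Everything here is hypothetical and chosen only for illustration: the function names, the comparison-group scores, the 20-item test length, and the 80% mastery cut. The same raw score is interpreted once against a comparison group (norm-referenced) and once against a fixed criterion (criterion-referenced).

```python
from bisect import bisect_left


def percentile_rank(score, group_scores):
    """Norm-referenced view: percent of the comparison group scoring below `score`."""
    ordered = sorted(group_scores)
    below = bisect_left(ordered, score)  # count of scores strictly below `score`
    return 100.0 * below / len(ordered)


def has_mastered(items_correct, items_total, cut_proportion=0.8):
    """Criterion-referenced view: did the proportion correct reach the cut?

    The 0.8 default is an assumed cut for illustration, not a recommended standard.
    """
    return items_correct / items_total >= cut_proportion


# Hypothetical raw scores on a 20-item test for two different comparison groups.
strong_group = [12, 14, 15, 15, 16, 17, 18, 18, 19, 20]
weak_group = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

student_score = 15

rank_in_strong_group = percentile_rank(student_score, strong_group)
rank_in_weak_group = percentile_rank(student_score, weak_group)
mastered = has_mastered(student_score, 20)

print(rank_in_strong_group, rank_in_weak_group, mastered)
```

In this sketch the student's relative standing changes with the comparison group, while the mastery decision depends only on the score and the criterion.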

What Is the Domain for a CRT?

CRTs are built around a well-defined domain of knowledge, skills, and/or behaviors to be measured. The domain for a CRT is narrower and more clearly delineated than that of an NRT, which usually covers many objectives. The level of specificity, however, varies somewhat across different types of CRTs. In some classroom situations, CRTs are called objective-referenced because they measure detailed learning outcomes (e.g., adding two three-digit numbers that require carrying). In large-scale tests that are standards-based, the criteria may be slightly less detailed (e.g., measuring whether a student can represent mathematical situations using algebraic symbols).

How Are the Items Designed?

In NRTs, items are usually developed so that the difficulty level is average and discrimination among student scores is high. Because easy items do not lend themselves to discriminating among individuals, they are not usually used on NRTs. For CRTs, however, item difficulty and discrimination are not of utmost importance. Rather, the most critical aspect of developing a CRT is that each item must have a direct match to a learning objective, behavior, or standard within the domain. Depending on the purpose of administering the CRT, the items may be very easy or difficult. When a test is given immediately after instruction to determine whether students attained the knowledge necessary to move on to the next topic, it is likely that the items will have a low difficulty level. In that case, there would also be no concern about whether the test was able to discriminate among students.
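The two item statistics mentioned above can be sketched in Python. The passage does not define them, so this sketch assumes the common classical definitions: difficulty as the proportion of examinees answering the item correctly (a higher proportion means an easier item), and discrimination as the difference in that proportion between higher-scoring and lower-scoring examinees. The function names, the half-and-half group split, and the response data are all hypothetical.

```python
def item_difficulty(responses):
    """Proportion of examinees answering the item correctly (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)


def discrimination_index(responses, total_scores, fraction=0.5):
    """Upper-group minus lower-group proportion correct on one item.

    Examinees are ordered by total test score; the top and bottom `fraction`
    of them form the two groups. The 0.5 split is an assumption of this sketch.
    """
    n = len(responses)
    k = max(1, int(n * fraction))
    order = sorted(range(n), key=lambda i: total_scores[i])  # lowest total first
    lower, upper = order[:k], order[-k:]
    p_upper = sum(responses[i] for i in upper) / k
    p_lower = sum(responses[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical post-instruction data for eight students.
total_scores = [10, 9, 9, 8, 8, 7, 6, 3]
easy_item = [1, 1, 1, 1, 1, 1, 1, 0]     # nearly everyone answers correctly
mastered_item = [1, 1, 1, 1, 1, 1, 1, 1]  # everyone answers correctly

easy_p = item_difficulty(easy_item)
easy_d = discrimination_index(easy_item, total_scores)
mastered_d = discrimination_index(mastered_item, total_scores)

print(easy_p, easy_d, mastered_d)
```

An item that every student answers correctly cannot separate higher from lower scorers, which is why such items are rarely used on an NRT. On a CRT given right after instruction, the same result may simply indicate that the objective was mastered.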

...
