The standard error of measurement is a statistic that indicates the variability of the errors of measurement by estimating the average number of points by which observed scores differ from true scores. Understanding the standard error of measurement requires an introduction to basic concepts in reliability theory. Therefore, this entry first examines true scores, error of measurement, and variance components. Next, it discusses the standard error of measurement and its uses. Last, it comments on methods for reducing measurement error.

True Score and Error of Measurement

When psychological tests are administered, points on items measuring the same construct are aggregated to generate the raw score for each respondent. The raw score, or observed score, can be conceptualized as the sum of two distinct components: the true score and the error of measurement. The true score reflects systematic influences that produce consistency in the observed score on the test. Note that the true score does not speak to the true standing of an individual on the construct; instead, it only indicates the unchanging portion of the observed score. The error of measurement results from chance influences that produce random variation in the observed score on the test.

$$X = T + E$$

where X is the observed score, T is the true score, and E is the error of measurement.

The concepts of the true score and the error of measurement can be illustrated with the following example. Consider an examinee who takes an ability test, and assume that the same test can be administered multiple times without the examinee remembering the test items. The examinee's observed score might be influenced by factors such as true ability level, test-taking skills, lucky guessing, and fatigue. The portion of the observed score contributed by the ability level and the test-taking skills is invariant across different possible administrations of the test, as long as the examinee's ability and test-taking skills remain unchanged. This consistent portion of the observed score is defined as the examinee's true score. Note that the true score might be influenced by the examinee's level of the construct, that is, the ability level, as well as by factors that are irrelevant to the construct, such as test-taking skills.

In contrast, fatigue and lucky guessing might vary across administrations of the test, so their effects are transient rather than systematic. The portion of the observed score associated with these transient factors is the error of measurement. The error of measurement is random, sometimes positive and sometimes negative, and when the examinee takes the same test a large number of times, the average of the errors approaches zero. In other words, the average of those observed scores approximates the true score.
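The claim that the errors average out over many administrations can be checked with a small simulation. The following is a minimal sketch, not a method from this entry: the true score of 80, the error standard deviation of 5, the normally distributed errors, and the function name `simulate_administrations` are all illustrative assumptions.

```python
import random
import statistics


def simulate_administrations(true_score, error_sd, n_administrations, seed=0):
    """Return simulated observed scores, each equal to true score plus random error.

    Assumption for illustration: errors are drawn independently from a
    normal distribution with mean zero.
    """
    rng = random.Random(seed)
    return [true_score + rng.gauss(0.0, error_sd) for _ in range(n_administrations)]


# Hypothetical values chosen only for this sketch.
true_score = 80.0
error_sd = 5.0

observed = simulate_administrations(true_score, error_sd, n_administrations=100_000)
errors = [x - true_score for x in observed]

mean_error = statistics.fmean(errors)
mean_observed = statistics.fmean(observed)

print(f"mean error:          {mean_error:.3f}")
print(f"mean observed score: {mean_observed:.3f}")
```

With a large number of simulated administrations, the mean error should be close to zero and the mean observed score close to the assumed true score. Any single observed score can still differ from the true score by several points.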

Variance Components

The amount of variation in observed scores on a test from a population of interest can also be partitioned into two components: variation caused by true scores and variation caused by errors of measurement. Expressed in terms of variance components:

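$$\sigma_X^2 = \sigma_T^2 + \sigma_E^2$$

where \sigma_X^2 is the observed score variance, \sigma_T^2 is the true score variance, and \sigma_E^2 is the error variance.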
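The partition of observed score variance into true score variance and error variance can be illustrated numerically. The sketch below assumes a hypothetical population in which true scores have mean 100 and standard deviation 15, errors have mean 0 and standard deviation 5, and errors are generated independently of true scores. These numbers are illustrative assumptions, not values from this entry.

```python
import random
import statistics

rng = random.Random(42)
n_examinees = 50_000

# Hypothetical population: true scores and errors drawn independently.
true_scores = [rng.gauss(100.0, 15.0) for _ in range(n_examinees)]
errors = [rng.gauss(0.0, 5.0) for _ in range(n_examinees)]

# Each observed score is the sum of a true score and an error.
observed = [t + e for t, e in zip(true_scores, errors)]

var_true = statistics.pvariance(true_scores)
var_error = statistics.pvariance(errors)
var_observed = statistics.pvariance(observed)

print(f"true score variance:     {var_true:.1f}")
print(f"error variance:          {var_error:.1f}")
print(f"sum of the two:          {var_true + var_error:.1f}")
print(f"observed score variance: {var_observed:.1f}")
```

In a finite sample the two sides agree only approximately, because the sample covariance between true scores and errors is small but not exactly zero. The partition holds exactly only when errors are uncorrelated with true scores.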

The reliability of a test is defined as the extent to which the observed score variance is caused by true score variance. For a given amount of observed score variance, the larger the true score variance, or equivalently the smaller the error variance, the more reliable the test is. Reliability can be expressed as

...
