
In order for the scores produced by a measure (referred to hereinafter as scores) to prove useful for the purpose of basic or applied research, it is critical that the measure be reliable. Reliability can be viewed from the perspective of systematic versus nonsystematic sources of variability in scores. To the degree that scores are systematic, they lead to measurement that is precise. Thus, if the level of an attribute (e.g., verbal ability) of a measured entity (e.g., person) remains unchanged, then repeated measurement of the attribute should produce scores that do not vary from one measurement occasion to the next. The greater the degree to which the variability in scores is a function of systematic variance, the more reliable the measure.

Scores are nonsystematic to the extent that they contain error variance that is random in nature. The greater the degree of error variance, the more scores will vary from one measurement occasion to the next. Error in the scores will cause them to vary across occasions even though the level of the measured attribute remains constant.

Ways of Viewing the Reliability of a Measure

There are two basic ways of characterizing the reliability of a measure. One is the reliability coefficient (rxx), which varies between 0 (totally unreliable) and 1 (totally reliable).

Conceptually, it is the proportion of variance in scores that is systematic. It can be estimated through three general strategies. The test-retest strategy requires that a number (N) of entities be measured on two occasions with the same k-item measure. The alternative forms strategy involves measuring the N entities at approximately the same time using two forms of a measure that are designed to assess the same underlying construct but that have different items. The internal consistency method relies on measuring the N entities on a single occasion with a measure having multiple (k) items.
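As an illustration of the internal consistency strategy, the following sketch estimates reliability via Cronbach's alpha, alpha = [k/(k − 1)] × (1 − Σ item variances / total-score variance), using a small hypothetical data set (6 entities, k = 4 items); the data values are invented for illustration only:

```python
import statistics

# Hypothetical responses of N = 6 entities to a k = 4 item measure (rows = entities)
data = [
    [3, 4, 3, 4],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 2, 1],
    [3, 3, 4, 3],
]
k = len(data[0])

item_vars = [statistics.variance(col) for col in zip(*data)]  # variance of each item
total_var = statistics.variance([sum(row) for row in data])   # variance of total scores

alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)  # Cronbach's alpha
print(round(alpha, 3))  # → 0.949
```

Note that this strategy requires only a single measurement occasion, unlike the test-retest and alternative forms strategies.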

The reliability of a measure can also be characterized in terms of the standard error of measurement, denoted herein as σmeas, which should not be confused with the standard error of the mean (σM). Assuming that an attribute of an entity was measured a large number of times with a specific measure of the variable x and that its level remained constant over time, the σmeas would be the standard deviation of scores produced by the measure. Alternatively, the σmeas can be thought of in terms of the standard deviation of scores resulting from measuring a given entity with j measures of x that meet the classical test theory requirements of being mutually parallel. In practice, the estimation of σmeas requires neither (a) the repeated measurement of the entity with a single measure of x nor (b) the measurement of the entity with j parallel measures of x.

Assuming the measurement of x across N measured entities, an estimate of σmeas (denoted hereinafter as σ¯meas) can be obtained using estimates of the reliability of the measure (rxx) and the standard deviation of scores produced by the measure (sx):

σ¯meas = sx√(1 − rxx)

For example, assume that a researcher measured the quantitative ability of 30 individuals using a 25-item test and that the resulting scores had a mean of 18, a standard deviation of 5, and an internal consistency-based estimate of reliability of .80. In this case, σ¯meas would be 5√(1 − .80) = 2.24.
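The arithmetic of this worked example can be verified directly, using the standard deviation and reliability estimate given above:

```python
s_x = 5.0    # standard deviation of observed scores (from the example)
r_xx = 0.80  # internal consistency-based reliability estimate

sem = s_x * (1 - r_xx) ** 0.5  # standard error of measurement
print(round(sem, 2))  # → 2.24
```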
