
The two most important properties of an assessment are its validity and reliability. Validity refers to the meaningfulness of the interpretations and uses of a test score and is the most important property of an assessment. Reliability refers to the extent to which test scores are free from errors of measurement. Thus, validity examines the interpretations and uses that can reasonably be made from the consistent part of the test scores, whereas reliability is concerned with inconsistent or random errors of measurement. As a result, reliability is a necessary but not sufficient condition for validity. That is, there needs to be some level of consistency to understand the meaningfulness of particular uses and interpretations of test scores, but measuring consistently does not guarantee the meaningfulness of the interpretations or uses.

Reliability and validity are not global properties of an assessment. Instead, they are properties of specific uses and interpretations that are made from a set of test scores. A test could be valid for a particular use or interpretation and not for another. For example, a test might measure the curriculum covered in a school without providing valid estimates of student performance because of the length of the tests or the nonequivalence of forms. The same is true for reliability. For example, a test might provide reliable scoring without being stable over time. In addition, reliability and validity are a matter of degree. Tests are not considered valid or invalid. Instead, they are valid to some degree. Similarly, a test is not considered reliable or unreliable, but is reliable to some degree.

Estimates of reliability are indices that quantify the amount of measurement error for a particular test use or interpretation for a specified population. Although reliability can be defined broadly in terms of consistency or generalizability, specific statistical indices of reliability vary depending on the statistical model and the sources of error. The statistical model may be based on classical test theory, generalizability theory, or item response theory. Classical test theory and generalizability theory are based on total scores, whereas item response theory is based on an estimate of a latent trait. In this entry, only classical test theory and generalizability theory are considered. Within each theory, there are multiple indices of reliability based on multiple sources of measurement error, including item heterogeneity, equivalence of test forms, stability over time, and consistency of subjective ratings. Different sources of error are of concern in different contexts. For example, the test score of a student writing an essay is affected by errors in scoring, whereas the test score of a student taking a multiple-choice test is affected by the heterogeneity of the items selected to measure the construct. In addition, a test score can be affected by multiple sources of error simultaneously. A student taking the GRE might be affected by the heterogeneity of the items, the form of the test, and the subjectivity of the scoring for the written portion of the test. Thus, there are many types of reliability that vary depending on the sources of error being considered as well as the statistical model or test theory being used. The appropriate index is selected based on the particular test use or score interpretation being made, and one type of reliability should not be considered interchangeable with another.
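To make one such index concrete, the following is a minimal sketch of Cronbach's alpha, a common classical-test-theory index of internal consistency that addresses measurement error arising from item heterogeneity. The function name and the small score matrix are hypothetical, for illustration only; alpha is computed as (k / (k − 1)) × (1 − Σ item variances / total-score variance), where k is the number of items.

```python
# Cronbach's alpha: a classical-test-theory index of internal consistency,
# quantifying measurement error due to item heterogeneity.
# Rows are examinees; columns are item scores.

def cronbach_alpha(scores):
    n_items = len(scores[0])

    def var(xs):
        # Sample variance with an (n - 1) denominator.
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [var([row[i] for row in scores]) for i in range(n_items)]
    total_var = var([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 0/1 item scores for five examinees on a four-item test.
data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(data), 3))  # → 0.8
```

Note that this index speaks only to one source of error (item heterogeneity); a different source of error, such as instability over time, would call for a different index, such as a test–retest correlation.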

...
