Reliability

Rocío Fernández-Ballesteros

doi:10.4135/9780857025753

Edited by:
Rocío Fernández-Ballesteros
In: Encyclopedia of Psychological Assessment
Chapter DOI: https://doi.org/10.4135/9780857025753.n171
Subject: Assessment, Psychometrics
Keywords: true scores

Introduction

Reliability as a central concept of test theory dates back to the beginning of the 20th century. It rests on the observation that scores vary not only between persons but also within a person from one measurement to the next. This intra-individual variability is treated as measurement error, and with it the true score was introduced as a central concept of classical test theory. Observed score variance can then be written as true score variance plus error variance. The reliability of a test, rating scale, assessment or any other more or less standardized procedure within a given (sub)population of persons (or other objects of measurement, e.g. classrooms) is defined as the ratio of true score variance to observed score variance or, equivalently, as the squared correlation between observed scores X and true scores T (Lord & Novick, 1968: 61):

$$\rho_{XT}^2 = \frac{\sigma_T^2}{\sigma_X^2}$$

Its minimum value is zero, its maximum value one. As will be demonstrated, the definition is not very useful until we have defined precisely what we mean by ‘error’.
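The two forms of the definition can be checked numerically. The following is a minimal sketch, not part of the original entry: it assumes normally distributed true scores and independent, normally distributed errors, and the function names, sample size and standard deviations are illustrative choices only.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_reliability(n_persons=20000, sd_true=3.0, sd_error=2.0, seed=1):
    """Simulate observed = true + error and return three reliability values."""
    rng = random.Random(seed)
    # Assumed model: true scores and errors are independent and normal.
    true = [rng.gauss(50.0, sd_true) for _ in range(n_persons)]
    observed = [t + rng.gauss(0.0, sd_error) for t in true]
    # Definition 1: ratio of true score variance to observed score variance.
    variance_ratio = statistics.pvariance(true) / statistics.pvariance(observed)
    # Definition 2: squared correlation between observed and true scores.
    squared_corr = pearson(observed, true) ** 2
    # Population value implied by the chosen standard deviations.
    theoretical = sd_true ** 2 / (sd_true ** 2 + sd_error ** 2)
    return variance_ratio, squared_corr, theoretical


variance_ratio, squared_corr, theoretical = simulate_reliability()
print(f"variance ratio:      {variance_ratio:.3f}")
print(f"squared correlation: {squared_corr:.3f}")
print(f"population value:    {theoretical:.3f}")
```

With these settings the population reliability is 9 / (9 + 4), and both sample estimates should land close to it. In real data the true scores are unobserved, so neither quantity can be computed this directly; that is why what counts as 'error' has to be specified first.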

After the 1960s, Item Response Theory (IRT) became an influential approach in test theory. In IRT, person parameters on a latent scale replace true scores. At first sight, there seems to be no place for reliability within IRT. It can be demonstrated, however, that reliability is an important concept in this newer test-theoretical approach as well.

Reliability and Sources of Variation

When the height of a person is measured repeatedly, we notice small differences between the readings: there is error in the measurements. The same holds when a person's characteristics are measured in psychological testing. If an intelligence test were administered to a person repeatedly, we would expect the scores to vary: again there is measurement error. Unfortunately, the experiment of repeatedly testing a person with the same measurement instrument is seldom done, because in practice we should expect memory effects. Instead, we could administer two tests meant to measure the same construct. A score difference might then be due not only to chance fluctuations in item responses, but also to differences in content. Many more sources of variation can be thought of, for example systematic fluctuation of responses over time. Sources of variance due to person characteristics can be classified as lasting or temporary, and as general or specific. Further, there are factors affecting test administration, and there is a category for variance not otherwise accounted for. Most of these sources of variation in responses might be regarded as sources of error variation, but the same sources might be regarded as sources of true variation, depending on the purpose of the test administrator. Stanley (1971: 366), who discusses sources of variation extensively, gives an example. A person may be fatigued on the day of testing, and this influences test performance. When our interest is in predicting performance over some period, the fatigue counts as error, and the relevant reliability is consistency over time. When the intercorrelations among tests administered at the same session are studied, the fatigue affects all of those tests alike, and consistency within that session is what matters. So the definition of error depends on the purpose of the investigator, and this purpose should determine the choice of reliability coefficient(s).
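The fatigue example can be illustrated with a small simulation. This is a hedged sketch under assumptions that are not in the original entry: each score is modelled as a lasting trait plus a day-specific state (such as fatigue) plus random noise, all independent and normal, and every name and numerical value below is hypothetical.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_sessions(n_persons=20000, sd_trait=3.0, sd_state=2.0,
                      sd_noise=2.0, seed=7):
    """Return (within-session, over-time) correlations of parallel tests."""
    rng = random.Random(seed)
    day1_test_a, day1_test_b, day2_test = [], [], []
    for _ in range(n_persons):
        trait = rng.gauss(100.0, sd_trait)      # lasting characteristic
        state_day1 = rng.gauss(0.0, sd_state)   # e.g. fatigue on day 1
        state_day2 = rng.gauss(0.0, sd_state)   # e.g. fatigue on day 2
        # Two tests in the same session share the day-1 state.
        day1_test_a.append(trait + state_day1 + rng.gauss(0.0, sd_noise))
        day1_test_b.append(trait + state_day1 + rng.gauss(0.0, sd_noise))
        # A test on another day shares only the lasting trait.
        day2_test.append(trait + state_day2 + rng.gauss(0.0, sd_noise))
    within_session = pearson(day1_test_a, day1_test_b)
    over_time = pearson(day1_test_a, day2_test)
    return within_session, over_time


within_session, over_time = simulate_sessions()
print(f"consistency within one session: {within_session:.3f}")
print(f"consistency over time:          {over_time:.3f}")
```

Under this assumed model the within-session correlation estimates (9 + 4) / 17, because the day-specific state is shared by both tests and so behaves like true variance, whereas the over-time correlation estimates 9 / 17, because the state now counts as error. The same data thus yield different reliability coefficients depending on which sources of variation are treated as error.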

...
