Split-Half Reliability

Neil J. Salkind

doi:10.4135/9781412961288

Edited by:
Neil J. Salkind
In: Encyclopedia of Research Design
Chapter DOI: https://doi.org/10.4135/9781412961288.n431
Subject: Research Design
Keywords: measurement; split-half reliability

Measurement is fundamental to almost all forms of research and applied science. To conduct quantitative research, scientists must measure at least one variable. For example, researchers studying the effect of social rejection on self-esteem must measure participants’ self-esteem in some way. Similarly, to apply scientific knowledge, practitioners often rely heavily on measurement. For example, school psychologists measure children's academic and cognitive aptitudes to place them in appropriate classes and to identify potential academic difficulties. Given the importance of measurement, researchers and practitioners must evaluate the quality of the measurement tools that they use. Reliability is a key facet of measurement quality, and split-half reliability is a method of estimating the reliability of a measurement instrument.

Reliability

Briefly stated, reliability reflects the precision of scores obtained from a measurement instrument—how closely participants’ scores on the instrument correspond to their real characteristics. Unfortunately, many factors can interfere with measurement in any scientific domain, some of which are unsystematic sources of measurement error. Such factors artificially inflate some participants’ scores and deflate others’ scores in a random, or unsystematic, way. In behavioral research, these factors can include guessing, poorly written items, fatigue, misreading test items, and temporary mood states.

Consider, for example, a participant in a study involving a measure of trait self-esteem (i.e., the degree to which a person sees himself or herself in a generally positive way). Imagine that the participant actually has a high level of trait self-esteem, generally having a positive view of himself or herself. Unfortunately, one or two of the self-esteem questionnaire's items are worded in a confusing manner (e.g., “I rarely feel as if I don't have low self-esteem”). Such items can elicit confused responses that do not reflect accurately the person's truly high level of self-esteem, thereby introducing error and imprecision into the measurement process. As an index of measurement precision, reliability reflects the degree to which test scores are free of unsystematic measurement error.
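In classical test theory, this idea is usually written as a variance ratio. The symbols below (X for an observed score, T for the true score, E for unsystematic error, and ρ for reliability) are standard notation but do not appear in this excerpt, so they should be read as an assumption about notation rather than as the entry's own formulation. The variance decomposition also assumes that error is uncorrelated with the true score.

```latex
X = T + E, \qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad
\rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
```

Under this reading, reliability approaches 1 as error variance shrinks toward 0, and approaches 0 as error variance comes to dominate the observed scores.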

Reliability cannot be known directly, so it must be estimated. Much as a person's self-esteem is not directly observable and must be estimated from his or her test scores, reliability is not directly observable and must be estimated from a set of test scores. As a fundamental facet of reliability, measurement error cannot be known in reality—researchers cannot truly know the degree to which a respondent's scores are affected by fatigue, confusing wording, mood states, or any of the many factors potentially affecting test scores. Consequently, reliability must be estimated from the scores obtained on the measurement instrument itself. Split-half reliability is one of many approaches to estimating the reliability of scores on a measurement instrument.
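A small simulation may help show why estimation is needed. In simulated data the true scores are known, so reliability can be computed directly as the share of observed-score variance that is due to true scores (the classical test theory definition, which this excerpt does not state explicitly). With real data only the observed scores exist, so that direct calculation is unavailable. The normal distributions, the standard deviations, the sample size, and all function names here are illustrative assumptions.

```python
import random


def variance(values):
    """Sample variance (n - 1 in the denominator)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def simulate_scores(n_people, true_sd, error_sd, seed=0):
    """Simulate true scores and observed scores (true score + random error).

    Both components are drawn from normal distributions; this is an
    assumption made for illustration only.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(0.0, true_sd) for _ in range(n_people)]
    observed = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    return true_scores, observed


# Equal true-score and error spread, so about half of the observed
# variance should be due to true scores.
true_scores, observed = simulate_scores(n_people=10_000, true_sd=1.0, error_sd=1.0)

# Possible only in a simulation: real studies never see true_scores.
simulated_reliability = variance(true_scores) / variance(observed)
print("share of observed variance due to true scores:",
      round(simulated_reliability, 2))
```

In an actual study the `true_scores` list does not exist, which is why methods such as split-half reliability work from the observed item responses alone.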

Computing and Interpreting Split-Half Reliability

The split-half method of estimating reliability is most directly applicable to instruments that have multiple items. Indeed, many instruments in behavioral research are tests, questionnaires, inventories, or surveys that include two or more items.

Table 1 Split-Half Reliability Example Data

Note: SD = standard deviation.

Consider the hypothetical set of responses in Table 1. Imagine that a researcher wishes to estimate the reliability of a four-item test of trait self-esteem, in which each item presents a statement relevant to self-esteem (e.g., “I often feel that I am a good person”). People respond to each item using a seven-point scale indicating their level of agreement with the statement (e.g., 1 = strongly disagree, 4 = neutral, and 7 = strongly agree); thus, larger numbers reflect greater self-esteem. People's responses are summed to create a total score indicating their level of trait self-esteem. Of course, many good tests include negatively keyed items, for which an endorsement or agreement reflects a low level of the characteristic being measured (e.g., “I rarely feel like I'm a good person”). Such items must be reverse scored before scoring the scale and evaluating reliability. As shown in Table 1, Person 2 has the highest level of self-esteem and Person 4 has the lowest. Being aware that scores on the instrument might be affected by measurement error, the researcher estimates the reliability of these scores.
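The scoring steps described above, followed by one common way of computing a split-half estimate, can be sketched in Python. The responses below are invented for illustration and are not the values in Table 1. Treating item 4 as negatively keyed, splitting the items odd versus even, and all function names are assumptions. The Spearman-Brown step-up is the usual correction applied to a half-test correlation, but this excerpt does not confirm which procedure the entry itself uses.

```python
import math

SCALE_MIN, SCALE_MAX = 1, 7  # seven-point agreement scale from the text


def reverse_score(response):
    """Flip a response on the 1-7 scale (1 <-> 7, 2 <-> 6, 4 stays 4)."""
    return SCALE_MIN + SCALE_MAX - response


def score_items(raw_row, negatively_keyed):
    """Return one person's item scores with negatively keyed items reversed."""
    return [
        reverse_score(r) if i in negatively_keyed else r
        for i, r in enumerate(raw_row)
    ]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_halves):
    """Step a half-test correlation up to a full-length reliability estimate."""
    return 2 * r_halves / (1 + r_halves)


def split_half(scored_rows):
    """Odd-even split: correlate the two half scores, then step up."""
    half_a = [sum(row[0::2]) for row in scored_rows]  # items 1 and 3
    half_b = [sum(row[1::2]) for row in scored_rows]  # items 2 and 4
    r_halves = pearson_r(half_a, half_b)
    return r_halves, spearman_brown(r_halves)


# Hypothetical raw responses: five people, four items (NOT Table 1's data).
raw_responses = [
    [5, 4, 5, 3],
    [7, 6, 7, 1],
    [3, 4, 2, 5],
    [1, 2, 2, 6],
    [4, 5, 5, 4],
]
NEGATIVELY_KEYED = {3}  # assumption: the fourth item (index 3) is negatively keyed

scored = [score_items(row, NEGATIVELY_KEYED) for row in raw_responses]
totals = [sum(row) for row in scored]
r_halves, reliability_estimate = split_half(scored)

print("total scores:", totals)
print("half-test correlation:", round(r_halves, 3))
print("split-half reliability estimate:", round(reliability_estimate, 3))
```

Reverse scoring is done before the halves are formed, matching the order given in the text. An odd-even split is only one of several possible ways to divide the items, and different splits can yield different estimates.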

...
