Validity (General)

Rocío Fernández-Ballesteros

doi:10.4135/9780857025753

Edited by:
Rocío Fernández-Ballesteros
In: Encyclopedia of Psychological Assessment
Chapter DOI: https://doi.org/10.4135/9780857025753.n225
Subject: Assessment, Psychometrics
Keywords: inferences; testing; validity

Introduction

Tests and other forms of assessment are designed to provide information that will be useful for some purpose. The degree to which the information provided by a test score is useful, appropriate, and accurate is described by the psychometric concept of validity. Validity is the extent to which the inferences (interpretations) derived from test scores are justifiable from both scientific and equity perspectives. For decisions based on test scores to be valid, the use of a test for a particular purpose must be supported by theory and empirical evidence, and biases in the measurement process must be ruled out.

Validity is not an intrinsic property of a test. As many psychometricians have pointed out (e.g. Cronbach, 1971; Messick, 1989; Shepard, 1993), in judging the worth of a test, it is the inferences derived from the test scores that must be validated, not the test itself. Therefore, the specific purpose(s) for which test scores are being used must be considered when evaluating validity. For example, a test may be useful for one purpose, such as patient diagnosis, but not for another, such as evaluating the treatment of patients.

Contemporary definitions of validity in testing borrow largely from Messick (1989), who stated that ‘validity is an integrated evaluative judgement of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment’ (p. 13). From this definition, it is clear that validity cannot be established by a single study and that tests cannot be labelled ‘valid’ or ‘invalid’. Given that (a) validity is the most important consideration in evaluating the use of a test for a particular purpose, and (b) such utility can never be unequivocally established, demonstrating that a test is appropriate for a particular purpose is an arduous task.

In the remainder of this entry, specific forms of evidence for validity as well as some validation frameworks will be discussed. Before describing these concepts and practices, the following facts about validity in testing should be clear: (a) tests must be evaluated with respect to a particular purpose; (b) what needs to be validated are the inferences derived from test scores, not the test itself; (c) evaluating inferences made from test scores involves several different types of qualitative and quantitative evidence; and (d) evaluating the validity of inferences derived from test scores is not a one-time event but a continuous process.

In addition, although test developers must provide evidence to support the validity of the interpretations that are likely to be made from test scores, it is ultimately the responsibility of the users of a test to evaluate this evidence and ensure that the test is appropriate for the purpose(s) for which it is being used.

Test Validation

To make the task of validating inferences derived from test scores both scientifically sound and manageable, Kane (1992) proposed an ‘argument-based approach to validity’. In this approach, the validator builds an argument based on empirical evidence to support the use of a test for a particular purpose. Although this validation framework acknowledges that validity can never be established absolutely, it requires evidence that (a) the test measures what it claims to measure, (b) the test scores display adequate reliability, and (c) the test scores display relationships with other variables in a manner congruent with the test's predicted properties. Kane's practical perspective is congruent with the Standards for Educational and Psychological Testing (American Educational Research Association [AERA], American Psychological Association [APA], & National Council on Measurement in Education [NCME], 1999), which provide detailed guidance regarding the types of evidence that should be brought forward to support the use of a test for a particular purpose. For example, the Standards

...
