
Validity is the meaning, or value, of assessment results or test scores. Whereas reliability refers to the precision of a test or assessment outcome, validity refers to the meaning of that outcome. Historically, school psychologists have considered tests to have three forms of validity: content, construct, and criterion (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1985). Content validity (sometimes known as face validity) refers to examination of test content to establish its meaning. For example, the validity of most academic achievement tests is determined by their content (e.g., mathematics items create a mathematics test). Construct validity refers to the relationships the test has with other variables as an approach to establishing meaning. For example, the fact that raw scores on intelligence tests increase with age suggests that the tests measure a phenomenon (e.g., cognitive ability) that increases with age. Criterion validity refers to the relationship of the test to other socially valued criteria. For example, the relationship between college entrance exam scores and college students' grade point averages provides evidence of criterion-related validity.

However, recent advances in research and practice, reflected in the current edition of the joint testing standards (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999), provide broader standards to guide school psychologists' evaluation of test and assessment validity. The current Standards replace the three forms of validity with the notion that validity is a unitary construct. The meaning, or validity, of a test or assessment should be judged according to five sources of evidence:

  • Content evidence corresponds to the content validity discussed in the preceding paragraph.
  • Response processes refers to evidence showing that examinees use the targeted psychological processes (e.g., specific cognitive abilities, emotional traits) when responding to a test or assessment, rather than other, unintended processes that compromise the meaning of test results. For example, dynamic magnetic resonance imaging (MRI) showing increased neural activity in the brain's frontal lobes while an examinee completes a puzzle suggests the puzzle item taps planning processes.
  • Internal structure refers to how the components of a test or assessment relate to one another. For example, intelligence test batteries present factor analytic evidence to show that tests purporting to measure the same trait (e.g., fluid reasoning) are more related to each other than to other tests.
  • Relations to other variables refers to the relationship of the test to measures that are not part of the test. This includes the construct and criterion evidence mentioned in the previous paragraph (e.g., the relationship between raw test scores and age, or between test scores and performance in school or work settings).
  • Test consequences refers to how the test brings about intended (e.g., improved planning for educational or psychological interventions) and unintended (e.g., diminished expectations because of labeling) consequences for test takers.

Test developers should provide evidence in the five domains that is relevant to the claims they make for the test, and test users (e.g., school psychologists) should review this evidence to evaluate the degree to which test scores mean what developers say they mean. Careful examination of test claims and supporting evidence suggests that claims often exceed the evidence offered to support them. For example, developers of intelligence tests generally provide strong evidence supporting test content, internal structure, and relations to other variables, yet they do not provide evidence relevant to response processes or test consequences (Braden & Niebling, in press). Intelligence test critics note that these tests lack "treatment utility," or evidence that the test results improve interventions and outcomes for students. However, consequential validity evidence is limited for all forms of school psychology assessment, not just intelligence tests. The new standards challenge test developers and test users (e.g., school psychologists) to expand the breadth and depth of evidence that defines the validity, or meaning, of test scores and assessment results.

...
