
Introduction

Educational and psychological testing has been undergoing major changes in recent years. Demands for new psychological measures, increased interest in diagnostic assessment, the influence of cognitive psychology on testing, the introduction of new test item formats, and the role of computers in test administration, scoring, and score interpretation are five of the many changes taking place in testing practice today. Less well known among psychologists is that the basic psychometric theory for developing educational and psychological tests and for evaluating tests and test scores is changing too, and these changes should make the construction and evaluation of tests, and the interpretation of test scores, easier and potentially more valid.

Psychologists have seen occasional references to the Rasch model, the three-parameter logistic model, latent trait theory, item response theory, latent ability, item characteristic curves, computer adaptive testing, etc. in popular psychological testing texts, test manuals, and journals (see, for example, Anastasi, 1989). These new psychometric terms are associated with modern test theory, known as ‘item response theory’. The purposes of this entry are (1) to describe some of the shortcomings of classical test theory, models, and methods, (2) to introduce item response theory and related concepts and models, and (3) to identify some of the advantages of item response theory and associated methods for psychologists.
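The simplest of the models named above can make these terms concrete. A minimal sketch, assuming the one-parameter (Rasch) model, in which an examinee's probability of answering an item correctly depends only on the examinee's ability (theta) and the item's difficulty (b); the function name and values here are illustrative, not taken from the entry:

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model,
    where theta is examinee ability and b is item difficulty.
    Plotting this against theta traces an item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When ability equals item difficulty, the probability is exactly 0.5,
# and the probability increases monotonically with ability.
p_low = rasch_prob(-2.0, 0.0)
p_mid = rasch_prob(0.0, 0.0)
p_high = rasch_prob(2.0, 0.0)
```

An item characteristic curve is simply this function graphed over a range of abilities; the two- and three-parameter logistic models add a discrimination parameter and a lower asymptote (guessing) parameter to the same basic form.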

Shortcomings of Classical Test Theory and Methods

Classical test theory has provided the statistical underpinnings for both educational and psychological tests. While popular psychological testing books such as those of Thorndike and Hagen, Anastasi, and Cronbach do not provide the relevant theory and derivations, all of the popular measurement formulas and approaches for constructing tests, evaluating tests, and interpreting scores that appear in these books (e.g. Spearman-Brown formula, standard error of measurement, corrections for score range restrictions) are derived from the classical test model.
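Two of the classical formulas mentioned above can be sketched briefly. The following is an illustration, not a quotation from any of the cited texts; the function names are my own. The Spearman-Brown prophecy formula predicts reliability when a test's length is changed, and the standard error of measurement expresses score imprecision in score units:

```python
import math

def spearman_brown(r, k):
    """Spearman-Brown prophecy formula: predicted reliability when
    test length is multiplied by a factor k, given current reliability r."""
    return (k * r) / (1.0 + (k - 1.0) * r)

def standard_error_of_measurement(sd, reliability):
    """Standard error of measurement: SD of observed scores times
    the square root of (1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

# Doubling a test with reliability 0.80 is predicted to raise it to about 0.89.
doubled = spearman_brown(0.80, 2)

# A test with SD 15 and reliability 0.91 has an SEM of about 4.5 score points.
sem = standard_error_of_measurement(15.0, 0.91)
```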

Despite the usefulness of classical test theory and models in psychometric methods, shortcomings in the basic theory underlying psychological testing and in measurement procedures for test construction have been recognized for over 50 years (see Gulliksen, 1950). One such shortcoming is that classical item statistics (item difficulty and item discrimination) depend on the particular examinee samples from which they were obtained. A consequence of this dependence is that these item statistics are useful only when constructing tests for examinee populations similar to the sample from which the statistics were obtained. Unfortunately, one cannot always be sure that the population of examinees for whom a test is intended is similar to the sample of examinees used in obtaining the item statistics. Preferable would be item statistics that are independent of the particular sample of examinees from which they are obtained. 'Invariant item statistics over samples' is the goal.
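The sample dependence described above is visible in the statistics themselves. A rough sketch of the two classical item statistics, assuming dichotomously scored (0/1) items; discrimination is computed here as the point-biserial correlation between the item score and the total score, one common choice (names and data are illustrative):

```python
import math

def item_difficulty(responses):
    """Classical item difficulty: the proportion of examinees
    answering the item correctly (0/1 scoring)."""
    return sum(responses) / len(responses)

def item_discrimination(responses, totals):
    """Point-biserial correlation between item score (0/1) and
    each examinee's total test score."""
    n = len(responses)
    mean_r = sum(responses) / n
    mean_t = sum(totals) / n
    cov = sum((r - mean_r) * (t - mean_t)
              for r, t in zip(responses, totals)) / n
    var_r = sum((r - mean_r) ** 2 for r in responses) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    return cov / math.sqrt(var_r * var_t)

# The same item looks "easier" in a more able sample: the difficulty
# index computed on a high-scoring subsample differs from the full sample.
full_sample = [1, 1, 0, 1, 0, 0, 1, 0]
able_subsample = full_sample[:4]  # suppose these are the high-ability examinees
p_full = item_difficulty(full_sample)
p_able = item_difficulty(able_subsample)
```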

Not only are the popular classical item statistics used in test development sample dependent, but so are other important test statistics such as test reliability and validity. Test reliability is higher when estimated in heterogeneous samples of examinees than in more homogeneous ones. Correction factors are often used to adjust reliability estimates for this problem, but the dependence of reliability indices on the choice of examinee sample remains troublesome. Again, test statistics independent of examinee samples would be valuable.
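One way to see the effect of sample heterogeneity on reliability: under the classical model, if the standard error of measurement is assumed constant across groups, reliability necessarily falls as the group's score variance shrinks. A sketch under that assumption (the formula is the standard classical-theory relation; the function name is my own):

```python
def reliability_in_new_group(rel_old, sd_old, sd_new):
    """Reliability re-estimated for a group with a different observed-score
    SD, under the classical assumption that error variance (and hence the
    standard error of measurement) is the same in both groups."""
    error_var = (sd_old ** 2) * (1.0 - rel_old)
    return 1.0 - error_var / (sd_new ** 2)

# A test with reliability 0.90 in a heterogeneous group (SD = 15)
# drops to 0.775 in a more homogeneous group (SD = 10).
rel_homogeneous = reliability_in_new_group(0.90, 15.0, 10.0)
```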

...
