
Introduction

Educational and psychological testing has been undergoing major changes in recent years. Demands for new psychological measures, increased interest in diagnostic assessment, the influence of cognitive psychology on testing, the introduction of new test item formats, and the role of computers in test administration, scoring, and score interpretation are five of the many changes taking place in testing practice today. Less well known among psychologists is that the basic psychometric theory for developing educational and psychological tests and for evaluating tests and test scores is changing too, and these changes should make the construction and evaluation of tests, and the interpretation of test scores, easier and potentially more valid.

Psychologists have seen occasional references to the Rasch model, the three-parameter logistic model, latent trait theory, item response theory, latent ability, item characteristic curves, computer adaptive testing, etc. in popular psychological testing texts, test manuals, and journals (see, for example, Anastasi, 1989). These new psychometric terms are associated with modern test theory, known as ‘item response theory’. The purposes of this entry are (1) to describe some of the shortcomings of classical test theory, models, and methods, (2) to introduce item response theory and related concepts and models, and (3) to identify some of the advantages of item response theory and associated methods for psychologists.
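The simplest of the models named above can make these terms concrete. A minimal sketch, assuming the one-parameter (Rasch) model, in which an examinee's probability of answering an item correctly depends only on the examinee's ability (theta) and the item's difficulty (b); the function name and values here are illustrative, not taken from the entry:

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model,
    where theta is examinee ability and b is item difficulty.
    Plotting this against theta traces an item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When ability equals item difficulty, the probability is exactly 0.5,
# and the probability increases monotonically with ability.
p_low = rasch_prob(-2.0, 0.0)
p_mid = rasch_prob(0.0, 0.0)
p_high = rasch_prob(2.0, 0.0)
```

An item characteristic curve is simply this function graphed over a range of abilities; the two- and three-parameter logistic models add a discrimination parameter and a lower asymptote (guessing) parameter to the same basic form.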

Shortcomings of Classical Test Theory and Methods

Classical test theory has provided the statistical underpinnings for both educational and psychological tests. While popular psychological testing books such as those of Thorndike and Hagen, Anastasi, and Cronbach do not provide the relevant theory and derivations, all of the popular measurement formulas and approaches for constructing tests, evaluating tests, and interpreting scores that appear in these books (e.g. Spearman-Brown formula, standard error of measurement, corrections for score range restrictions) are derived from the classical test model.
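Two of the classical formulas mentioned above can be sketched briefly. The following is an illustration, not a quotation from any of the cited texts; the function names are my own. The Spearman-Brown prophecy formula predicts reliability when a test's length is changed, and the standard error of measurement expresses score imprecision in score units:

```python
import math

def spearman_brown(r, k):
    """Spearman-Brown prophecy formula: predicted reliability when
    test length is multiplied by a factor k, given current reliability r."""
    return (k * r) / (1.0 + (k - 1.0) * r)

def standard_error_of_measurement(sd, reliability):
    """Standard error of measurement: SD of observed scores times
    the square root of (1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

# Doubling a test with reliability 0.80 is predicted to raise it to about 0.89.
doubled = spearman_brown(0.80, 2)

# A test with SD 15 and reliability 0.91 has an SEM of about 4.5 score points.
sem = standard_error_of_measurement(15.0, 0.91)
```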

Despite the usefulness of classical test theory and models in psychometric methods, shortcomings in the basic theory underlying psychological testing and in measurement procedures for test construction have been recognized for over 50 years (see Gulliksen, 1950). One such shortcoming is that classical item statistics (item difficulty and item discrimination) depend on the particular examinee samples from which they were obtained. A consequence of this dependence is that these item statistics are useful only when constructing tests for examinee populations similar to the sample from which the statistics were obtained. Unfortunately, one cannot always be sure that the population of examinees for whom a test is intended is similar to the sample of examinees used in obtaining the item statistics. Preferable would be item statistics that are independent of the particular sample of examinees from which they are obtained. 'Invariant item statistics over samples' is the goal.
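The sample dependence described above is visible in the statistics themselves. A rough sketch of the two classical item statistics, assuming dichotomously scored (0/1) items; discrimination is computed here as the point-biserial correlation between the item score and the total score, one common choice (names and data are illustrative):

```python
import math

def item_difficulty(responses):
    """Classical item difficulty: the proportion of examinees
    answering the item correctly (0/1 scoring)."""
    return sum(responses) / len(responses)

def item_discrimination(responses, totals):
    """Point-biserial correlation between item score (0/1) and
    each examinee's total test score."""
    n = len(responses)
    mean_r = sum(responses) / n
    mean_t = sum(totals) / n
    cov = sum((r - mean_r) * (t - mean_t)
              for r, t in zip(responses, totals)) / n
    var_r = sum((r - mean_r) ** 2 for r in responses) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    return cov / math.sqrt(var_r * var_t)

# The same item looks "easier" in a more able sample: the difficulty
# index computed on a high-scoring subsample differs from the full sample.
full_sample = [1, 1, 0, 1, 0, 0, 1, 0]
able_subsample = full_sample[:4]  # suppose these are the high-ability examinees
p_full = item_difficulty(full_sample)
p_able = item_difficulty(able_subsample)
```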

Not only are the popular classical item statistics used in test development sample dependent, but so are other important test statistics such as test reliability and validity. Test reliability is higher when estimated in heterogeneous samples of examinees than in more homogeneous ones. Correction factors are often used to adjust reliability estimates for this problem, but the dependence of reliability indices on the choice of examinee sample remains troublesome. Again, test statistics independent of examinee samples would be valuable.
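One way to see the effect of sample heterogeneity on reliability: under the classical model, if the standard error of measurement is assumed constant across groups, reliability necessarily falls as the group's score variance shrinks. A sketch under that assumption (the formula is the standard classical-theory relation; the function name is my own):

```python
def reliability_in_new_group(rel_old, sd_old, sd_new):
    """Reliability re-estimated for a group with a different observed-score
    SD, under the classical assumption that error variance (and hence the
    standard error of measurement) is the same in both groups."""
    error_var = (sd_old ** 2) * (1.0 - rel_old)
    return 1.0 - error_var / (sd_new ** 2)

# A test with reliability 0.90 in a heterogeneous group (SD = 15)
# drops to 0.775 in a more homogeneous group (SD = 10).
rel_homogeneous = reliability_in_new_group(0.90, 15.0, 10.0)
```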

...
