Generalizability (G) theory is a statistical theory for evaluating the dependability (“reliability”) of behavioral measurements. G theory pinpoints the sources of systematic and unsystematic measurement error, disentangles them, and then estimates their magnitudes simultaneously.

In G theory, a behavioral measurement (e.g., an academic self-concept score) is conceived of as a sample from a universe of admissible observations. This universe consists of all possible observations that decision makers consider to be acceptable substitutes (e.g., scores sampled on Occasions 2 and 3) for the observation in hand (scores on Occasion 1). Each characteristic of the measurement situation (e.g., survey form, item, occasion, rater) is a potential source of error and is called a facet of a measurement. The universe of admissible observations is defined by all possible combinations of the levels (called conditions) of the facets. To evaluate the dependability of behavioral measurements, a generalizability (G) study is designed to isolate and estimate as many facets of measurement error as are reasonably and economically feasible.

For example, consider a G study in which all persons in a sample respond to the same 10 randomly sampled academic self-concept items on the same two randomly sampled occasions. In this G study, the facets of the measurement are items and occasions. Because the self-concept inventory was designed to capture systematic variation among persons, persons are the object of measurement; they are not a source of error and, therefore, not a facet.

An observed score for a particular person on a particular item and occasion is decomposed into an effect for the grand mean, plus effects for the person, the item, the occasion, each two-way interaction, and a residual (three-way interaction plus unsystematic error). The distribution of each component or “effect,” except for the grand mean, has a mean of zero and a variance σ² (called the variance component). The variance component for the person effect is called the universe-score variance. The variance components for the other effects are considered error variation.
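
The decomposition just described can be written out for the crossed persons × items × occasions example. The notation below is an assumption made for illustration (ν for an effect, subscripts p, i, o for person, item, occasion, and "pio,e" for the residual); the entry itself does not fix a notation.

```latex
% Observed score for person p on item i at occasion o
X_{pio} = \mu + \nu_p + \nu_i + \nu_o
          + \nu_{pi} + \nu_{po} + \nu_{io} + \nu_{pio,e}

% Each effect other than \mu has mean zero and its own variance component,
% so the variance of observed scores is the sum of the components
\sigma^2(X_{pio}) = \sigma^2_p + \sigma^2_i + \sigma^2_o
                    + \sigma^2_{pi} + \sigma^2_{po} + \sigma^2_{io}
                    + \sigma^2_{pio,e}
```

Here \sigma^2_p is the universe-score variance, and the remaining six components are the potential sources of error.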

An estimate of each variance component can be obtained from an analysis of variance (or other methods, such as maximum likelihood). The relative magnitudes of the estimated variance components provide information about systematic differences in self-concept among persons (universe-score variance) and sources of error influencing the measurement. Statistical tests are not used in G theory; instead, standard errors for variance component estimates provide information about sampling variability of estimated variance components.
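
As a rough illustration of the analysis-of-variance route, the sketch below estimates the seven variance components for a fully crossed persons × items × occasions design with one score per cell, by equating observed mean squares to their expected values under a random-effects model. The function name, the dictionary keys, and the toy scores are assumptions made for this example, not part of the entry; negative estimates, which can occur with small samples, are returned as they are.

```python
from itertools import product


def estimate_variance_components(x):
    """ANOVA estimates for a fully crossed persons x items x occasions design.

    x[p][i][o] is the score of person p on item i at occasion o. Every cell
    must be filled, and each dimension needs at least two levels.
    """
    n_p, n_i, n_o = len(x), len(x[0]), len(x[0][0])
    P, I, O = range(n_p), range(n_i), range(n_o)
    grand = sum(x[p][i][o] for p, i, o in product(P, I, O)) / (n_p * n_i * n_o)

    # Marginal means for each main effect and each two-way table.
    m_p = [sum(x[p][i][o] for i in I for o in O) / (n_i * n_o) for p in P]
    m_i = [sum(x[p][i][o] for p in P for o in O) / (n_p * n_o) for i in I]
    m_o = [sum(x[p][i][o] for p in P for i in I) / (n_p * n_i) for o in O]
    m_pi = [[sum(x[p][i][o] for o in O) / n_o for i in I] for p in P]
    m_po = [[sum(x[p][i][o] for i in I) / n_i for o in O] for p in P]
    m_io = [[sum(x[p][i][o] for p in P) / n_p for o in O] for i in I]

    # Sums of squares; the residual is what remains of the total.
    ss_p = n_i * n_o * sum((m - grand) ** 2 for m in m_p)
    ss_i = n_p * n_o * sum((m - grand) ** 2 for m in m_i)
    ss_o = n_p * n_i * sum((m - grand) ** 2 for m in m_o)
    ss_pi = n_o * sum((m_pi[p][i] - m_p[p] - m_i[i] + grand) ** 2
                      for p in P for i in I)
    ss_po = n_i * sum((m_po[p][o] - m_p[p] - m_o[o] + grand) ** 2
                      for p in P for o in O)
    ss_io = n_p * sum((m_io[i][o] - m_i[i] - m_o[o] + grand) ** 2
                      for i in I for o in O)
    ss_total = sum((x[p][i][o] - grand) ** 2 for p, i, o in product(P, I, O))
    ss_res = ss_total - ss_p - ss_i - ss_o - ss_pi - ss_po - ss_io

    # Mean squares (sum of squares divided by degrees of freedom).
    ms_p = ss_p / (n_p - 1)
    ms_i = ss_i / (n_i - 1)
    ms_o = ss_o / (n_o - 1)
    ms_pi = ss_pi / ((n_p - 1) * (n_i - 1))
    ms_po = ss_po / ((n_p - 1) * (n_o - 1))
    ms_io = ss_io / ((n_i - 1) * (n_o - 1))
    ms_res = ss_res / ((n_p - 1) * (n_i - 1) * (n_o - 1))

    # Solve the expected-mean-square equations for the variance components.
    return {
        "p": (ms_p - ms_pi - ms_po + ms_res) / (n_i * n_o),  # universe score
        "i": (ms_i - ms_pi - ms_io + ms_res) / (n_p * n_o),
        "o": (ms_o - ms_po - ms_io + ms_res) / (n_p * n_i),
        "pi": (ms_pi - ms_res) / n_o,
        "po": (ms_po - ms_res) / n_i,
        "io": (ms_io - ms_res) / n_p,
        "pio,e": ms_res,  # three-way interaction confounded with error
    }


if __name__ == "__main__":
    # Made-up scores for 3 persons x 2 items x 2 occasions (illustration only).
    toy_scores = [
        [[3, 4], [2, 4]],
        [[5, 5], [4, 6]],
        [[1, 2], [2, 1]],
    ]
    for effect, value in estimate_variance_components(toy_scores).items():
        print(f"{effect:>6}: {value:.3f}")
```

Comparing the printed values with one another is the point of the exercise: a large "p" component relative to the rest indicates that most of the observed variation reflects differences among persons.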

G theory distinguishes a decision (D) study from a G study. A D study uses variance-component estimates from a G study to design a measurement procedure that minimizes error for a particular purpose.

In planning a D study, the decision maker defines the universe that he or she wishes to generalize to, called the universe of generalization, which may contain some or all of the facets and their conditions in the universe of admissible observations. The D study draws on the variance-component estimates from the G study to evaluate alternative designs. Because decisions usually will be based on the mean (or sum) over multiple observations (e.g., questionnaire items) rather than on a single observation (a single item), D-study designs vary the numbers of conditions of the relevant facets, thereby affecting “reliability”; the more conditions there are, all else being equal, the higher the reliability will be.
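
The effect of the number of conditions on reliability can be sketched for the crossed persons × items × occasions example. The code below assumes a random-effects design and relative (rank-ordering) decisions, for which the error variance of a mean over n_i items and n_o occasions is σ²(pi)/n_i + σ²(po)/n_o + σ²(pio,e)/(n_i·n_o), and the generalizability coefficient is the universe-score variance divided by itself plus that error. The function names, dictionary keys, and variance-component values are hypothetical and chosen only for illustration.

```python
def relative_error_variance(vc, n_i, n_o):
    """Error variance for relative decisions based on a mean over
    n_i items and n_o occasions (crossed, random-effects design)."""
    return vc["pi"] / n_i + vc["po"] / n_o + vc["pio,e"] / (n_i * n_o)


def g_coefficient(vc, n_i, n_o):
    """Generalizability coefficient: universe-score variance divided by
    universe-score variance plus relative error variance."""
    return vc["p"] / (vc["p"] + relative_error_variance(vc, n_i, n_o))


if __name__ == "__main__":
    # Hypothetical G-study estimates (not real data).
    example_vc = {"p": 1.0, "pi": 0.5, "po": 0.25, "pio,e": 1.0}

    # Compare alternative D-study designs.
    for n_i, n_o in [(1, 1), (5, 1), (10, 1), (10, 2), (20, 2)]:
        coef = g_coefficient(example_vc, n_i, n_o)
        print(f"items={n_i:2d} occasions={n_o}: coefficient={coef:.3f}")
```

Because each error component is divided by the number of conditions sampled, adding items or occasions shrinks the error term and raises the coefficient, with the largest gains coming from the facet that contributes the most error.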

...
