
Intercoder Reliability Coefficients, Comparison of

Reliability coefficients are designed to measure the degree to which data warrant serious consideration in subsequent analyses. They call for reliability data that consist of several independent replications of the process by which phenomena of scholarly interest are converted into analyzable data. This process may be embodied in mechanical measuring devices, performed during standardized tests, or enacted by human coders who are formally instructed to interpret selected phenomena, distinguish among the units that are relevant to a research question, and assign well-defined categories or numerical values to them. Written coding instructions are provided to standardize the process across all coders employed, and it is these coding instructions that must also be reproducible elsewhere for the resulting data to be considered reliable. This entry discusses criteria for reliable data and compares several agreement coefficients, highlighting the differences.

Criteria for Reliable Data

It is important not to confuse agreement with reliability.

Agreement is measured among independent replications, whether they are due to multiple observers reporting on a chain of events, several coders using their literacy competencies in categorizing given texts, or different raters judging a set of performances.

Reliability is inferred from observed agreements. To serve as indications of reliability, agreement must be measured on independent replications and reveal the degree to which the resulting data (a) can serve as surrogates for the phenomena of interest, (b) are of a form amenable to available analytical or computational methods of analysis, and (c) can provide sufficient information (i.e., exhibit the variation needed to answer the questions that guide a research project). Not all agreement measures can assure researchers of the reliability of their data. Declaring an agreement measure to be a reliability coefficient does not make it one.

Agreement measures that aspire to be indicative of the reliability of data must provide a scale with two numerically distinct values. The obvious one is agreement without exception, which supports the inference that data are perfectly reliable. It usually measures 1. The other must indicate the condition under which reliability is completely absent, usually measuring 0. With reference to the three conditions previously mentioned, reliability should be considered absent when (a) no relationship exists between the phenomena of interest and the data that are to be analyzed in their place, (b) data are ambiguous or not in the form that a chosen analysis requires, and (c) data do not exhibit the variation needed to lead researchers to valid conclusions.
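The two anchor points of such a scale can be illustrated with a small sketch. It contrasts raw percent agreement with a chance-corrected coefficient of the form (observed − expected) / (1 − expected), where expected agreement is computed from the pooled category proportions of two coders (the form of Scott's pi). The choice of this particular coefficient, the function names, and the toy data are assumptions made for illustration only; they are not taken from the entry. Returning 0 when the data show no variation is likewise an assumption, adopted here to mirror condition (c).

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of units on which the two coders assigned the same category."""
    matches = sum(x == y for x, y in zip(coder_a, coder_b))
    return matches / len(coder_a)


def chance_corrected_agreement(coder_a, coder_b):
    """Illustrative coefficient: (observed - expected) / (1 - expected).

    Expected agreement uses category proportions pooled over both coders.
    1 means agreement without exception; 0 means agreement no better than
    what the pooled category distribution would produce by chance.
    """
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    pooled = Counter(coder_a) + Counter(coder_b)
    if len(pooled) < 2:
        # Assumption: with a single category in use, the data exhibit no
        # variation, so reliability is treated as absent (condition c).
        return 0.0
    expected = sum((count / (2 * n)) ** 2 for count in pooled.values())
    return (observed - expected) / (1 - expected)


# Hypothetical data: both coders use "x" for nine of ten units but never
# agree on which unit is the rare "y".
skewed_a = ["x"] * 9 + ["y"]
skewed_b = ["y"] + ["x"] * 9

# Hypothetical data: perfect agreement with variation across categories.
varied = ["x", "y", "x", "y"]

print(percent_agreement(skewed_a, skewed_b))
print(chance_corrected_agreement(skewed_a, skewed_b))
print(chance_corrected_agreement(varied, varied))
```

In the skewed example the coders agree on most units, yet the chance-corrected value falls slightly below 0 because nearly all agreement stems from the dominance of one category. This is one way in which a high agreement figure can fail to indicate reliability.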

With the phenomena of interest generally no longer accessible once data are created, observed agreements among replications can say little about what the data actually represent. When interpretable as reliability coefficients, they can merely assess the extent to which the distinctions that researchers are able to make in the data correspond to differences among the phenomena of interest as they had been seen and described by observers, coders, or judges employed in the process of generating the data.

This epistemological condition suggests that while reliable data cannot guarantee valid research results, the probability of deriving valid conclusions from them diminishes with their increasing unreliability.

...
