Intercoder Reliability Techniques: Fleiss System

The use of multiple coders to make evaluations and provide an assessment of some quality is a common practice in research. For example, suppose a researcher is coding discourse from the transcript of a conversation involving a conflict. A potential set of codes might classify each person's turn in the conversation as (a) an attack on the comments of the other person, (b) an attempt to bolster or defend one's self, (c) a move to integrate, collaborate, or seek common ground, or (d) some other comment.

The challenge is that the person's conversational move must be evaluated and a meaning assigned (using the coding system) by an outside observer. The coding of utterances (or any other feature) by an outside observer is subject to the interpretation and application of the classification scheme. Although many turns in a conversation may be easy and obvious to code, other conversational turns may not be. When examining a coding or classification scheme, one question to ask is whether other persons using the same definitions and categories would make assignments in the same manner as the researcher.

Intercoder reliability reflects the degree of agreement on the assignment of values by persons working independently to make determinations using a common system of evaluation. Many different approaches exist, such as Cohen's kappa, Krippendorff's alpha, and the Holsti method. This entry examines the use of the Fleiss system for examining intercoder reliability. The critical element of any intercoder reliability system is separating the agreement that occurs due to random chance from the agreement achieved beyond chance.

Defining the Fleiss Method

Fleiss's kappa provides a measure for handling intercoder reliability when more than two coders are employed to evaluate an issue. The statistic applies to nominal or binary rating systems. No version is available for ordinal or interval scales; the general recommendation for those scales is to use a version of Cronbach's alpha to assess the reliability of the coders.

The general equation for kappa is the following:

\kappa = (\bar{P} - \bar{P}_e)/(1 - \bar{P}_e),

where \bar{P} indicates the average observed agreement among the coders and \bar{P}_e indicates the average agreement expected due to random chance. The effect of kappa is to estimate the agreement among coders after removing the agreement that random chance alone would produce. The reliability of the coding scheme depends on the coders' ability to apply the categories with agreement greater than random chance. This equation is the same as the one used for Cohen's kappa; the difference is that the estimates of observed and chance agreement are adjusted to account for multiple coders.
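
As a brief worked illustration (the numbers here are hypothetical, chosen only to show the arithmetic): suppose the coders agree in 70% of their pairwise comparisons, so \bar{P} = .70, and the category distribution implies that 50% agreement would occur by chance alone, so \bar{P}_e = .50. Then

\kappa = (.70 - .50)/(1 - .50) = .20/.50 = .40,

meaning the coders achieve 40% of the agreement that is possible beyond chance.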

To calculate the average observed agreement, the agreement for each coded item is first calculated using the following equation:

P_i = \frac{1}{n(n-1)} \left( \sum_{j=1}^{k} n_{ij}^2 - n \right),

where n is the number of coders, k is the number of categories, and n_{ij} is the number of coders assigning item i to category j. The equation compares the number of observed pairwise agreements on item i to the number of possible pairs of coders, with the agreement summed across the total number of categories in the coding scheme. To calculate the average, \bar{P}, the value P_i is computed for every coded item and the results are averaged. For example, if a scheme were to code the clothing of a person in a high school as (a) geek, (b) goth, (c) jock, or (d) tweeker, the summation for each coded person would contain four terms, one for each category. The average \bar{P} is simply the arithmetic average of the P_i values across all of the coded persons.
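
The full calculation is easiest to see in code. The following is a minimal sketch in Python, assuming the ratings have already been tallied into an item-by-category count matrix; the function name fleiss_kappa and the example data are illustrative assumptions, not part of the original entry. The chance-agreement term \bar{P}_e is computed as the sum of squared overall category proportions, the standard formulation for Fleiss's kappa.

import numpy as np

def fleiss_kappa(counts):
    """Fleiss's kappa for an N x k matrix of category counts.

    counts[i, j] is the number of coders who assigned item i to
    category j; every row must sum to the same number of coders n.
    """
    counts = np.asarray(counts, dtype=float)
    N, k = counts.shape
    n = counts[0].sum()  # coders per item (assumed constant)

    # Per-item agreement: P_i = (sum_j n_ij^2 - n) / [n(n - 1)]
    P_i = (np.square(counts).sum(axis=1) - n) / (n * (n - 1))
    P_bar = P_i.mean()  # average observed agreement

    # Chance agreement: p_j is the overall share of assignments
    # falling in category j; P_e_bar = sum_j p_j^2
    p_j = counts.sum(axis=0) / (N * n)
    P_e_bar = np.square(p_j).sum()

    return (P_bar - P_e_bar) / (1 - P_e_bar)

# Four coders classify three persons into the four clothing
# categories (geek, goth, jock, tweeker); data are hypothetical.
ratings = [[4, 0, 0, 0],   # unanimous agreement
           [2, 2, 0, 0],   # split between two categories
           [1, 1, 1, 1]]   # complete disagreement
print(round(fleiss_kappa(ratings), 3))  # prints 0.048

Note that kappa is near zero here even though two of the three items show substantial agreement: the heavy concentration of assignments in the first category inflates the chance-agreement term, which is exactly the correction the Fleiss system is designed to make.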

...
