
Cohen's kappa statistic was developed to correct for the problem of inflated percent agreement statistics that occur when marginal values on the variables being compared are unevenly distributed. Kappa is typically used with nominal level variables and is often seen in situations in which two independent raters have the task of classifying an object as belonging to a single level of a nominal variable.

For example, consider the following data set in which two raters, Steve and Damian, are asked to read 100 school mission statements. For each mission statement, they are to make a judgment as to the dominant purpose of schooling set forth in the mission statement. Each school may have only one dominant theme, and the theme should fit into one of the following categories: (a) social, (b) cognitive, (c) civic, or (d) emotional. The results of the rater classifications are shown in Table 1, where the values are reported as proportions (e.g., the value of .05 in the social cell indicates that 5 of the 100 schools were classified as having a social purpose as their dominant theme).

The marginal totals indicate the percentage of ratings assigned to each category by each rater. In this example, Steve classified 61% of the 100 mission statements as belonging to the civic category, whereas Damian placed 70% of the mission statements in that category. The diagonal values of Table 1 represent ratings on which the two raters agreed exactly. Thus, the raters agreed on their assignment of 5% of the mission statements to the social category, 3% to the emotional category, 9% to the cognitive category, and 54% to the civic category. From a simple percent agreement perspective, then, the two raters agreed on 71% of the ratings they assigned. The percent agreement (PA) calculation is derived by summing the values on the diagonal (i.e., the proportion of times that the two raters agreed). Note that the resultant value of 71% generally represents good agreement:

PA = .05 + .03 + .09 + .54 = .71.
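The percent agreement calculation above can be sketched in code. This is a minimal illustration (not from the original article), with the Table 1 joint proportions entered as a nested list; rows are Damian's classifications and columns are Steve's, in the order social, emotional, cognitive, civic:

```python
# Joint-classification proportions from Table 1.
# Rows: Damian's ratings; columns: Steve's ratings.
# Category order: social, emotional, cognitive, civic.
table = [
    [0.05, 0.00, 0.00, 0.00],  # Damian: social
    [0.01, 0.03, 0.00, 0.01],  # Damian: emotional
    [0.04, 0.01, 0.09, 0.06],  # Damian: cognitive
    [0.10, 0.03, 0.03, 0.54],  # Damian: civic
]

# Percent agreement is the sum of the diagonal cells, i.e., the
# proportion of statements both raters assigned to the same category.
pa = sum(table[i][i] for i in range(len(table)))
print(round(pa, 2))  # 0.71
```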

Yet the high percent agreement statistic is somewhat artificially inflated, given that more than half of the school mission statements were rated as having a civic theme. Consequently, a rater with no knowledge or training could simply assign a mission statement to the civic category whenever in doubt, and the raters would end up with percent agreement statistics that look very good simply because most schools had a civic purpose as their dominant theme. Unfortunately, such an artificially inflated agreement statistic deceives us into believing that the two raters are more adept at coding the statements than they actually are. The raters actually agree less than half of the time (44%, to be exact) when assigning mission statements to the categories other than civic.

Table 1 Example Data Matrix: Rater Classifications of 100 School Mission Statements

                                       Steve
Damian            Social      Emotional   Cognitive   Civic       Marginal total
Social            .05 (.01)   0 (0)       0 (.01)     0 (.03)     .05
Emotional         .01 (.01)   .03 (0)     0 (.01)     .01 (.03)   .05
Cognitive         .04 (.04)   .01 (.02)   .09 (.02)   .06 (.12)   .20
Civic             .10 (.14)   .03 (.05)   .03 (.08)   .54 (.43)   .70
Marginal total    .20         .07         .12         .61         1.00

Note: Values in parentheses represent the expected proportions on the basis of chance associations (i.e., the joint probabilities of the marginal proportions).

To correct for the problem of inflation and to provide a more accurate estimate of rater agreement, we can calculate Cohen's kappa. To calculate kappa, we begin by multiplying the row and column marginal totals to arrive at an expected proportion for each cell (reported in parentheses in Table 1). Summing the expected proportions in the diagonal, we find that on the basis of chance alone, we would expect an observed agreement value of .46:

...
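The chance-expected agreement described above, and the kappa statistic built from it, can be sketched as follows. This is a minimal illustration (not from the original article) using the standard Cohen's kappa formula, kappa = (PA − PE) / (1 − PE), where PE is the chance-expected agreement; the variable names are my own:

```python
# Joint-classification proportions from Table 1.
# Rows: Damian's ratings; columns: Steve's ratings.
# Category order: social, emotional, cognitive, civic.
table = [
    [0.05, 0.00, 0.00, 0.00],  # Damian: social
    [0.01, 0.03, 0.00, 0.01],  # Damian: emotional
    [0.04, 0.01, 0.09, 0.06],  # Damian: cognitive
    [0.10, 0.03, 0.03, 0.54],  # Damian: civic
]
k = len(table)

# Marginal totals: column sums for Steve, row sums for Damian.
steve_marginals = [sum(table[i][j] for i in range(k)) for j in range(k)]
damian_marginals = [sum(row) for row in table]

# Observed agreement: sum of the diagonal cells.
pa = sum(table[i][i] for i in range(k))

# Chance-expected agreement: sum of the products of the paired marginals.
pe = sum(steve_marginals[i] * damian_marginals[i] for i in range(k))
print(round(pe, 2))  # 0.46

# Cohen's kappa corrects the observed agreement for chance agreement.
kappa = (pa - pe) / (1 - pe)
print(round(kappa, 2))  # 0.46
```

Note that kappa (about .46) is considerably lower than the raw percent agreement of .71, which is exactly the correction for chance agreement that the statistic was designed to provide.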
