
Cohen's kappa statistic was developed to correct for the problem of inflated percent agreement statistics that occur when marginal values on the variables being compared are unevenly distributed. Kappa is typically used with nominal level variables and is often seen in situations in which two independent raters have the task of classifying an object as belonging to a single level of a nominal variable.

For example, consider the following data set in which two raters, Steve and Damian, are asked to read 100 school mission statements. For each mission statement, they are to make a judgment as to the dominant purpose of schooling set forth in the mission statement. Each school may have only one dominant theme, and the theme should fit into one of the following categories: (a) social, (b) cognitive, (c) civic, or (d) emotional. The results of the rater classifications are shown in Table 1, where the values are reported as proportions (e.g., the value of .05 in the social cell indicates that 5 of the 100 schools were classified as having a social purpose as their dominant theme).

The marginal totals indicate the percentage of ratings assigned to each category by each rater. In this example, Steve classified 61% of the 100 mission statements as belonging to the civic category, whereas Damian placed 70% of the mission statements in that category. The diagonal values of Table 1 represent ratings on which the two raters agreed exactly. Thus, the raters agreed on their assignment of 5% of the mission statements to the social category, 3% to the emotional category, 9% to the cognitive category, and 54% to the civic category. From a simple percent agreement perspective, then, the two raters agreed on 71% of the ratings they assigned. The percent agreement (PA) calculation is derived by summing the values on the diagonal (i.e., the proportion of times that the two raters agreed). Note that the resultant value of 71% generally represents good agreement:

PA = .05 + .03 + .09 + .54 = .71.
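The percent agreement calculation above can be sketched in code. This is a minimal illustration (not from the original article), with the Table 1 joint proportions entered as a nested list; rows are Damian's classifications and columns are Steve's, in the order social, emotional, cognitive, civic:

```python
# Joint-classification proportions from Table 1.
# Rows: Damian's ratings; columns: Steve's ratings.
# Category order: social, emotional, cognitive, civic.
table = [
    [0.05, 0.00, 0.00, 0.00],  # Damian: social
    [0.01, 0.03, 0.00, 0.01],  # Damian: emotional
    [0.04, 0.01, 0.09, 0.06],  # Damian: cognitive
    [0.10, 0.03, 0.03, 0.54],  # Damian: civic
]

# Percent agreement is the sum of the diagonal cells, i.e., the
# proportion of statements both raters assigned to the same category.
pa = sum(table[i][i] for i in range(len(table)))
print(round(pa, 2))  # 0.71
```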

Yet the high percent agreement statistic is somewhat artificially inflated, given that more than half of the school mission statements were rated as having a civic theme. Consequently, a rater with no knowledge or training could simply assign a mission statement to the civic category whenever in doubt, and the raters would end up with percent agreement statistics that look very good simply because most schools had a civic purpose as their dominant theme. Unfortunately, such an artificially inflated agreement statistic deceives us into believing that the two raters are more adept at coding the statements than they actually are. The raters actually agree less than half of the time (44%, to be exact) when assigning mission statements to the categories other than civic.

Table 1 Example Data Matrix: Rater Classifications of 100 School Mission Statements

                                       Steve
Damian            Social      Emotional   Cognitive   Civic       Marginal total
Social            .05 (.01)   0 (0)       0 (.01)     0 (.03)     .05
Emotional         .01 (.01)   .03 (0)     0 (.01)     .01 (.03)   .05
Cognitive         .04 (.04)   .01 (.02)   .09 (.02)   .06 (.12)   .20
Civic             .10 (.14)   .03 (.05)   .03 (.08)   .54 (.43)   .70
Marginal total    .20         .07         .12         .61         1.00

Note: Values in parentheses represent the expected proportions on the basis of chance associations (i.e., the joint probabilities of the marginal proportions).

To correct for the problem of inflation and to provide a more accurate estimate of rater agreement, we can calculate Cohen's kappa. To calculate kappa, we begin by multiplying the row and column marginal totals to arrive at an expected proportion for each cell (reported in parentheses in Table 1). Summing the expected proportions in the diagonal, we find that on the basis of chance alone, we would expect an observed agreement value of .46:

...
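The chance-expected agreement described above, and the kappa statistic built from it, can be sketched as follows. This is a minimal illustration (not from the original article) using the standard Cohen's kappa formula, kappa = (PA − PE) / (1 − PE), where PE is the chance-expected agreement; the variable names are my own:

```python
# Joint-classification proportions from Table 1.
# Rows: Damian's ratings; columns: Steve's ratings.
# Category order: social, emotional, cognitive, civic.
table = [
    [0.05, 0.00, 0.00, 0.00],  # Damian: social
    [0.01, 0.03, 0.00, 0.01],  # Damian: emotional
    [0.04, 0.01, 0.09, 0.06],  # Damian: cognitive
    [0.10, 0.03, 0.03, 0.54],  # Damian: civic
]
k = len(table)

# Marginal totals: column sums for Steve, row sums for Damian.
steve_marginals = [sum(table[i][j] for i in range(k)) for j in range(k)]
damian_marginals = [sum(row) for row in table]

# Observed agreement: sum of the diagonal cells.
pa = sum(table[i][i] for i in range(k))

# Chance-expected agreement: sum of the products of the paired marginals.
pe = sum(steve_marginals[i] * damian_marginals[i] for i in range(k))
print(round(pe, 2))  # 0.46

# Cohen's kappa corrects the observed agreement for chance agreement.
kappa = (pa - pe) / (1 - pe)
print(round(kappa, 2))  # 0.46
```

Note that kappa (about .46) is considerably lower than the raw percent agreement of .71, which is exactly the correction for chance agreement that the statistic was designed to provide.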
