
The Kendall rank correlation coefficient evaluates the degree of similarity between two sets of ranks given to the same set of objects. This coefficient depends upon the number of inversions of pairs of objects that would be needed to transform one rank order into the other. To compute it, each rank order is represented by the set of all pairs of objects (e.g., [a,b] and [b,a] are the two pairs representing the objects a and b), and each pair is assigned a value of 1 if its order corresponds to the way the two objects were ordered and a value of 0 if it does not. This coding scheme provides a set of binary values that is then used to compute a Pearson correlation coefficient.
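The binary coding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (binary_code, pearson) and the two example orders are hypothetical, and ties are assumed not to occur.

```python
from itertools import permutations
from math import sqrt

def binary_code(order):
    """Code every ordered pair [x, y] as 1 if x precedes y in `order`, else 0."""
    position = {obj: i for i, obj in enumerate(order)}
    objects = sorted(order)  # fixed pair sequence shared by both codings
    return [1 if position[x] < position[y] else 0
            for x, y in permutations(objects, 2)]

def pearson(x, y):
    """Plain Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Two illustrative orders of the same four objects (assumed, no ties).
order_1 = ["a", "c", "b", "d"]
order_2 = ["a", "c", "d", "b"]

codes_1 = binary_code(order_1)
codes_2 = binary_code(order_2)
r = pearson(codes_1, codes_2)
print(r)
```

With N objects each coding has N(N–1) entries, exactly half of which are 1, so the Pearson correlation of the two codings depends only on how many pairs are ordered the same way in both orders.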

Notations and Definition

Let S be a set of N objects,

S = {a, b, ..., x, y}.

When the order of the elements of the set is taken into account, we obtain an ordered set that can also be represented by the rank order given to the objects of the set. For example, with the following set of N = 4 objects,

S = {a, b, c, d},

the ordered set O1 = [a, c, b, d] gives the ranks R1 = [1, 3, 2, 4]. An ordered set on N objects can be decomposed into ½N(N–1) ordered pairs. For example, O1 is composed of the following six ordered pairs:

P1 = {[a, c], [a, b], [a, d], [c, b], [c, d], [b, d]}.
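The decomposition of an ordered set into its ordered pairs can be sketched as follows. The helper name ordered_pairs is an assumption made for illustration; pairs are stored as Python tuples (x, y), meaning that x precedes y.

```python
from itertools import combinations

def ordered_pairs(order):
    """Return the set of pairs (x, y) such that x precedes y in `order`."""
    # combinations() keeps the input sequence order, so x always comes before y.
    return set(combinations(order, 2))

order_1 = ["a", "c", "b", "d"]
pairs_1 = ordered_pairs(order_1)

n = len(order_1)
print(len(pairs_1), n * (n - 1) // 2)  # both counts equal ½N(N–1)
```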

To compare two ordered sets (on the same set of objects), Kendall's approach is to count the number of pairs that differ between the two ordered sets. This number gives a distance between these sets called the symmetric difference distance (the symmetric difference is a set operation that associates with two sets the set of elements that belong to exactly one of them).

The symmetric difference distance between two sets of ordered pairs P1 and P2 is denoted dΔ(P1, P2).
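As a minimal sketch, the symmetric difference distance maps directly onto Python's set operator ^. The function names and the reversed example order are assumptions chosen for illustration, not part of the original entry.

```python
from itertools import combinations

def ordered_pairs(order):
    """Set of pairs (x, y) such that x precedes y in `order`."""
    return set(combinations(order, 2))

def symmetric_difference_distance(pairs_1, pairs_2):
    """Number of ordered pairs that belong to exactly one of the two sets."""
    return len(pairs_1 ^ pairs_2)

order = ["a", "c", "b", "d"]
reverse = list(reversed(order))  # illustrative second order: exact reverse

d_same = symmetric_difference_distance(ordered_pairs(order), ordered_pairs(order))
d_reverse = symmetric_difference_distance(ordered_pairs(order), ordered_pairs(reverse))
print(d_same, d_reverse)
```

Identical orders share every pair, so their distance is 0; reversing an order flips every pair, so none are shared and the distance reaches N(N–1).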

The Kendall coefficient of correlation is obtained by normalizing the symmetric difference distance so that it takes values between −1 and +1, with −1 corresponding to the largest possible distance (obtained when one order is the exact reverse of the other order) and +1 corresponding to the smallest possible distance (equal to 0, obtained when both orders are identical). Taking into account that the maximum number of pairs that can differ between two sets with ½N(N–1) elements each is equal to N(N–1), this gives the following formula for the Kendall rank correlation coefficient:

τ = [½N(N–1) − dΔ(P1, P2)] / [½N(N–1)] = 1 − [2 × dΔ(P1, P2)] / [N(N–1)].
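The normalization described above can be sketched as a small function. It is a sketch under stated assumptions: kendall_tau is a hypothetical name, both orders are assumed to contain the same N ≥ 2 objects, and ties are not handled.

```python
from itertools import combinations

def ordered_pairs(order):
    """Set of pairs (x, y) such that x precedes y in `order`."""
    return set(combinations(order, 2))

def kendall_tau(order_1, order_2):
    """tau = 1 - 2 * d / (N * (N - 1)), with d the symmetric difference distance."""
    n = len(order_1)
    distance = len(ordered_pairs(order_1) ^ ordered_pairs(order_2))
    return 1 - 2 * distance / (n * (n - 1))

order = ["a", "c", "b", "d"]
tau_identical = kendall_tau(order, order)                 # distance 0
tau_reversed = kendall_tau(order, list(reversed(order)))  # distance N(N-1)
print(tau_identical, tau_reversed)
```

The two calls exercise the extreme cases named in the text: identical orders give the upper bound and exactly reversed orders give the lower bound.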

How should the Kendall coefficient be interpreted? Because τ is based upon counting the number of different pairs between two ordered sets, its interpretation can be framed in a probabilistic context. Specifically, for a pair of objects taken at random, τ can be interpreted as the difference between the probability of these objects being in the same order [denoted P(same)] and the probability of these objects being in a different order [denoted P(different)]. Formally, we have

τ = P(same) − P(different).
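The probabilistic reading can be checked numerically by counting, over all pairs of objects, how often the two orders agree. The sketch below uses exact fractions; the function name and the two example orders are assumptions made for illustration, and ties are assumed absent.

```python
from fractions import Fraction
from itertools import combinations

def pair_order_probabilities(order_1, order_2):
    """Return (P(same), P(different)) for a pair of objects drawn at random."""
    pos_1 = {obj: i for i, obj in enumerate(order_1)}
    pos_2 = {obj: i for i, obj in enumerate(order_2)}
    pairs = list(combinations(order_1, 2))
    # A pair is "same" when both orders rank its two objects the same way round.
    same = sum(1 for x, y in pairs
               if (pos_1[x] < pos_1[y]) == (pos_2[x] < pos_2[y]))
    p_same = Fraction(same, len(pairs))
    return p_same, 1 - p_same

p_same, p_different = pair_order_probabilities(["a", "c", "b", "d"],
                                               ["a", "c", "d", "b"])
tau = p_same - p_different
print(p_same, p_different, tau)
```

Here only the pair formed by b and d is ordered differently, so five of the six pairs agree and the difference of the two probabilities reproduces the coefficient.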

An Example

Suppose that two experts order four wines called {a, b, c, d}. The first expert gives the order O1 = [a, c, b, d], which corresponds to the ranks R1 = [1, 3, 2, 4]; the second expert orders the wines as O2 = [a, c, d, b], which corresponds to the ranks R2 = [1, 4, 2, 3]. The order given by the first expert is composed of the following six ordered

...
