
Kruskal-Wallis Test

The Kruskal-Wallis test is a nonparametric test used to decide whether k independent samples come from different populations. Different samples almost always differ in their sample values. This variation might be a result of chance (i.e., sampling error) if the samples are drawn from the same population, or it might reflect a genuine population difference (e.g., as a result of a different treatment of the samples). Usually the decision between these alternatives is made with a one-way analysis of variance (ANOVA). But in cases where the assumptions of an ANOVA are not fulfilled, the Kruskal-Wallis test is an alternative approach because it is a nonparametric method; that is, it does not rely on the assumption that the data are drawn from a particular probability distribution (e.g., the normal distribution).

Related nonparametric tests are the Mann-Whitney U test for k = 2 independent samples, the Wilcoxon signed rank test for k = 2 paired samples, and the Friedman test for k > 2 paired samples (repeated measurement); these are shown in Table 1.

The test is named after William H. Kruskal and W. Allen Wallis and was first published in the Journal of the American Statistical Association in 1952. Kruskal and Wallis called it the H test; it is sometimes also called one-way analysis of variance by ranks.


This entry begins with a discussion of the concept of the Kruskal-Wallis test and provides an example. Next, this entry discusses the formal procedure and corrections for ties. Last, this entry describes the underlying assumptions of the Kruskal-Wallis test.

Concept and Example

The idea of the test is to pool the observations of all k samples, order them, and assign each observation its rank in that ordering. After this initial step, all further calculations are based only on these ranks and no longer on the original observations. The underlying concept of the test is that these ranks should be evenly distributed across the k samples if all observations are from the same population. A simple example is used to demonstrate this.

A researcher made measurements on k = 3 different groups. Overall there are N = 15 observations. Data are arranged according to their group, and an individual rank is assigned to each observation starting with 1 for the smallest observation (see Table 2).

Two things might be noted here. First, in this example, for the sake of simplicity, all groups have the same number of observations; however, this is not a necessary condition. Second, several observations have the same value; these are called ties. In this case, all observations sharing the same value are assigned the mean of the ranks they occupy. In the current example, two observations have a value of 1280 and share ranks 4 and 5. Thus, both receive rank 4.5. Furthermore, three observations have a value of 1310 and would have received ranks 6 to 8. They each get rank 7, the mean of 6, 7, and 8.
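The mean-rank rule for ties can be sketched in a few lines of Python. This is only an illustration: the function name mean_ranks is hypothetical, and the sample values are invented (they are not the data of Table 2), chosen so that the ties fall on ranks 4 to 5 and 6 to 8 as in the text.

```python
def mean_ranks(values):
    """Return 1-based ranks; tied values share the mean of the ranks they span."""
    # Indices of the observations, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end (0-based) correspond to ranks start+1..end+1.
        shared = ((start + 1) + (end + 1)) / 2
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


# Hypothetical values, NOT the observations of Table 2.
sample = [1200, 1250, 1270, 1280, 1280, 1310, 1310, 1310, 1400]
print(mean_ranks(sample))
# [1.0, 2.0, 3.0, 4.5, 4.5, 7.0, 7.0, 7.0, 9.0]
```

The two values of 1280 receive rank 4.5 and the three values of 1310 receive rank 7, matching the rule described above.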

In the next step, the sum of ranks (R1, R2, R3) for each group is calculated. The overall sum of ranks is N(N + 1)/2. In the example case, this is 15 × 16 / 2 = 120. As a first check, the rank sums of all groups should add up to the same value: R1 + R2 + R3 = 59 + 29.5 + 31.5 = 120.
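The same check can be written as a short Python sketch. The rank sums and N are the values quoted in the text; the function name expected_rank_total is a hypothetical helper introduced only for this illustration.

```python
def expected_rank_total(n):
    """Sum of the ranks 1..n, i.e. n(n + 1)/2."""
    return n * (n + 1) / 2


# Rank sums R1, R2, R3 and total sample size N as given in the example.
rank_sums = [59, 29.5, 31.5]
n_total = 15

observed_total = sum(rank_sums)
print(observed_total, expected_rank_total(n_total))
# 120.0 120.0

# If the totals disagree, a rank was assigned or summed incorrectly.
assert observed_total == expected_rank_total(n_total)
```

Because tied observations receive the mean of the ranks they occupy, ties do not change the overall total, so this check holds with or without ties.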

...
