
Kolmogorov-Smirnov Test for Two Samples

The two-sample Kolmogorov-Smirnov test is designed to test the hypothesis that two independent groups have identical distributions. A possible appeal of the method is that it can be sensitive to differences between groups that might be routinely missed when using means, medians, or any single measure of location. For example, it might detect differences in the variances or the amount of skewness. More generally, it can detect differences between percentiles that might be missed with many alternative methods for comparing groups. Another positive feature is that it forms the basis of a graphical method for characterizing how groups differ over all the percentiles. That is, it provides an approach to assessing effect size that reveals details missed by other commonly used techniques. Moreover, the test is distribution-free, meaning that, assuming random sampling only, the probability of a Type I error can be determined exactly based on the sample sizes used. Historically, the test has been described as assuming that distributions are continuous. More precisely, assuming that tied values occur with probability zero, a recursive method for determining the exact probability of a Type I error is available. But more recently, a method that allows tied values was derived by Schröer and Trenkler.
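In practice, the test is readily available in SciPy as `scipy.stats.ks_2samp`; passing `method="exact"` requests the exact null distribution, which is feasible for small samples and reflects the distribution-free property described above. A quick illustration (the data here are made up, not from the original):

```python
# Illustrative use of SciPy's two-sample KS test; method="exact"
# computes the exact null distribution of the test statistic.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
x = rng.normal(loc=0.0, scale=1.0, size=25)  # first group
y = rng.normal(loc=0.0, scale=3.0, size=25)  # same mean, larger variance
res = ks_2samp(x, y, method="exact")
print(res.statistic, res.pvalue)
```

Because the two groups have equal means but unequal variances, a comparison of means could easily miss the difference that this test is able to detect.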

The details are as follows. Let X1,…, Xn be a random sample from the first group and Y1,…, Ym be a random sample from the second, and let F1 and F2 denote the corresponding distribution functions. Let I(Xi ≤ x) = 1 if Xi ≤ x; otherwise I(Xi ≤ x) = 0. F1 is estimated with

F̂1(x) = (1/n) Σ I(Xi ≤ x), the sum being over i = 1,…, n,
the proportion of observations less than or equal to x, and F2 is estimated in a similar manner. The null hypothesis is

H0: F1(x) = F2(x) for all x,
versus

H1: F1(x) ≠ F2(x) for at least one x.

The test statistic is based on what is sometimes called the Kolmogorov distance, which is just the maximum absolute difference between the two distributions under consideration. For convenience, let Z1,…, ZN be the pooled observations, where N = m + n, so the first n of the Z values correspond to X1,…, Xn and the remaining m correspond to Y1,…, Ym. The test statistic is

D = max |F̂1(Zi) − F̂2(Zi)|,

the maximum being taken over all i = 1,…, N.
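A minimal Python sketch of this computation (not part of the original text; the sample data are hypothetical) builds the two empirical distribution functions and takes their largest absolute difference over the pooled observations:

```python
# Empirical CDF: proportion of observations in `sample` that are <= x.
def ecdf(sample, x):
    return sum(v <= x for v in sample) / len(sample)

# D = max over the pooled values Z1,...,ZN of |F1hat(Zi) - F2hat(Zi)|.
def ks_statistic(x, y):
    pooled = list(x) + list(y)
    return max(abs(ecdf(x, z) - ecdf(y, z)) for z in pooled)

x = [1.1, 2.3, 2.9, 4.0, 5.2]  # hypothetical first sample
y = [1.8, 3.5, 4.1, 6.0, 7.7]  # hypothetical second sample
print(ks_statistic(x, y))  # largest gap between the two empirical CDFs
```

Evaluating at the pooled observations suffices because both empirical distribution functions are step functions that change only at observed values.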

A variation of the Kolmogorov-Smirnov test is sometimes suggested when there is interest in detecting differences in the tails of the distributions. Let M = nm/N, λ = n/N, and

Ĥ(x) = λF̂1(x) + (1 − λ)F̂2(x).

Now, the difference between any two distributions, at the value x, is estimated with

F̂1(x) − F̂2(x).

Then the hypothesis of identical distributions can be tested with an estimate of the largest weighted difference over all possible values of x. The test statistic is

D = max √M |F̂1(Zi) − F̂2(Zi)| / √(Ĥ(Zi)[1 − Ĥ(Zi)]),

where again the maximum is taken over all i, i = 1,…, N, subject to Ĥ(Zi)[1 − Ĥ(Zi)] > 0.
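The weighted version can be sketched in the same way (again illustrative Python, not from the original); pooled values where Ĥ(Zi)[1 − Ĥ(Zi)] = 0 are skipped, as the side condition requires:

```python
import math

# Empirical CDF: proportion of observations in `sample` that are <= x.
def ecdf(sample, x):
    return sum(v <= x for v in sample) / len(sample)

def weighted_ks_statistic(x, y):
    n, m = len(x), len(y)
    N = n + m
    M = n * m / N                # M = nm/N, as defined in the text
    lam = n / N                  # lambda = n/N
    best = 0.0
    for z in list(x) + list(y):  # pooled observations Z1,...,ZN
        f1, f2 = ecdf(x, z), ecdf(y, z)
        h = lam * f1 + (1 - lam) * f2   # pooled estimate H-hat(z)
        denom = h * (1 - h)
        if denom > 0:                   # enforce H-hat(1 - H-hat) > 0
            best = max(best, math.sqrt(M) * abs(f1 - f2) / math.sqrt(denom))
    return best

x = [1.1, 2.3, 2.9, 4.0, 5.2]  # hypothetical samples
y = [1.8, 3.5, 4.1, 6.0, 7.7]
print(weighted_ks_statistic(x, y))
```

The denominator √(Ĥ(1 − Ĥ)) is smallest in the tails, which is why this variant gives relatively more weight to differences in the tails of the distributions.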

Simply rejecting the hypothesis of equal distributions is not very informative. A more interesting issue is where distributions differ and by how much. A useful advance is an extension of the Kolmogorov-Smirnov test that addresses this issue. In particular, it is possible to compute confidence intervals for the difference between all of the quantiles in a manner where the probability of at least one Type I error can be determined exactly.

Suppose c is chosen so that P(D ≤ c) = 1 − α. Denote the order statistics by X(1) ≤ … ≤ X(n) and Y(1) ≤ … ≤ Y(m). For convenience, let X(0) = −∞ and X(n+1) = ∞. For any x satisfying X(i) ≤ x < X(i+1),

...
