Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov (KS) test is one of many goodness-of-fit tests that assess whether univariate data follow a hypothesized continuous probability distribution. Its most common use is to test whether data are normally distributed. Many statistical procedures assume normality, so the KS test can help validate their use. For example, in a linear regression analysis, the KS test can be used to check the assumption that the errors are normally distributed. However, the KS test is less powerful for assessing normality than tests such as the Shapiro-Wilk, Anderson-Darling, and Jarque-Bera tests, which are specifically designed to test for normal distributions. That is, when the data are not normal, the KS test fails to reject normality more often than those three tests do. Even so, the KS test is better in this regard than the widely used chi-square goodness-of-fit test. The KS test also has the advantage of being valid for testing data against any fully specified continuous distribution, not just the normal distribution, whereas the other three tests, as used for assessing normality, do not apply to non-normal distributions. Moreover, the KS test is distribution free, which means that the same table of critical values can be used whatever the hypothesized continuous distribution, normal or otherwise.
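As a rough illustration of the one-sample test described above, the following Python sketch computes the usual KS statistic, the largest vertical distance between the empirical distribution function of the data and the hypothesized distribution function, and converts it to an approximate p value using the large-sample Kolmogorov series. The function names and the data are hypothetical and chosen only for illustration. The p value is a large-sample approximation and can be rough for small samples, where exact tables are preferable.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def ks_statistic(data, cdf):
    """Largest vertical gap between the empirical CDF and a hypothesized CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x, so check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def kolmogorov_sf(lam, terms=100):
    """Large-sample approximation to P(sqrt(n) * D > lam)."""
    if lam < 0.2:
        return 1.0  # the tail probability is essentially 1 in this range
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical data, made up for illustration only.
scores = [-1.2, -0.8, -0.3, 0.1, 0.4, 0.9, 1.5, 2.1]

# Null hypothesis: the data are normal with mean 0 and variance 1,
# that is, a completely specified distribution.
d = ks_statistic(scores, normal_cdf)
p_approx = kolmogorov_sf(math.sqrt(len(scores)) * d)
print(f"D = {d:.3f}, approximate p value = {p_approx:.3f}")
```

Because the statistic depends on the data only through the hypothesized distribution function evaluated at each observation, the same `ks_statistic` function can be reused with any other continuous distribution function in place of `normal_cdf`, which reflects the distribution-free property.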

This entry discusses the KS test in relation to estimating parameters, multiple samples, and goodness-of-fit tests. An example illustrating the application and evaluation of a KS test is also provided.

Estimating Parameters

Most properties of the KS test have been developed for testing completely specified distributions. For example, one tests not just that the data are normal, but more specifically that the data are normal with a certain mean and a certain variance. If the parameters of the distribution are not known, it is common to estimate them in order to obtain a completely specified distribution. For example, to test whether the errors in a regression have a normal distribution, one could estimate the error variance by the mean-squared error and test whether the errors are normal with a mean of zero and a variance equal to the calculated mean-squared error. However, if parameters are estimated from the same data, the critical values in standard KS tables are no longer correct: they are too large, so the test becomes conservative and substantial power can be lost. To permit parameter estimation in the KS test, statisticians have developed corrected tables of critical values for testing particular distributions. For example, the adaptation of the KS test for testing the normal distribution with estimated mean and variance is called the Lilliefors test.
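The effect of estimating parameters can be seen in a small simulation. The sketch below is only an illustration under stated assumptions: the sample size of 20, the 2,000 replications, the seed, and all function names are arbitrary choices, not part of any published procedure. It draws repeated samples from a standard normal distribution and records the KS statistic twice, once against the true distribution and once against a normal distribution whose mean and standard deviation are estimated from the same sample. The simulated 5% critical value in the second case is in the spirit of a Lilliefors-type table.

```python
import math
import random
import statistics


def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def ks_statistic(data, cdf):
    """Largest vertical gap between the empirical CDF and a hypothesized CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def simulated_critical_values(n=20, reps=2000, alpha=0.05, seed=1):
    """Simulate upper-alpha critical values of D with known vs. estimated parameters."""
    rng = random.Random(seed)
    d_known, d_estimated = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Case 1: parameters known in advance (mean 0, standard deviation 1).
        d_known.append(ks_statistic(sample, normal_cdf))
        # Case 2: parameters estimated from the same sample.
        m = statistics.mean(sample)
        s = statistics.stdev(sample)
        d_estimated.append(ks_statistic(sample, lambda x: normal_cdf(x, m, s)))
    k = int(math.ceil((1.0 - alpha) * reps)) - 1  # index of the upper-alpha quantile
    return sorted(d_known)[k], sorted(d_estimated)[k]


crit_known, crit_estimated = simulated_critical_values()
print(f"5% critical value, parameters known:     {crit_known:.3f}")
print(f"5% critical value, parameters estimated: {crit_estimated:.3f}")
```

The critical value with estimated parameters comes out noticeably smaller, because the fitted distribution is pulled toward the data. Comparing a statistic computed with estimated parameters against the standard table therefore rejects too rarely.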

Multiple-Sample Extensions

The KS test has been extended in other ways. For example, there is a two-sample version of the KS test that is used to test whether two separate sets of data have the same distribution. As an example, one could have a set of scores for males and a set of scores for females. The two-sample KS test could be used to determine whether the distribution of male scores is the same as the distribution of female scores. The two-sample KS test does not require that the form of the hypothesized common distribution be specified. One does not need to specify whether the distribution is normal, exponential, and so on, and no parameters are estimated. The two-sample KS test is distribution free, so just one table of critical values suffices. The KS test has been extended further to test the equality of distributions when the number of samples exceeds two. For example, one could have scores from several different cities.
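A minimal sketch of the two-sample statistic follows. It takes the largest vertical distance between the two empirical distribution functions, and no hypothesized distribution or parameter estimate appears anywhere in the calculation. The scores and function names are hypothetical, and the critical value uses the common large-sample approximation with the constant 1.36 for a 5% test, so it should be treated as approximate for samples this small.

```python
import bisect
import math


def ks_two_sample(a, b):
    """Largest vertical gap between the empirical CDFs of two samples."""
    xs, ys = sorted(a), sorted(b)
    n, m = len(xs), len(ys)
    d = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # occurs at one of the observed values.
    for v in xs + ys:
        fa = bisect.bisect_right(xs, v) / n  # share of sample a at or below v
        fb = bisect.bisect_right(ys, v) / m  # share of sample b at or below v
        d = max(d, abs(fa - fb))
    return d


def approx_critical_value(n, m, c_alpha=1.36):
    """Large-sample critical value; c_alpha = 1.36 corresponds to a 5% test."""
    return c_alpha * math.sqrt((n + m) / (n * m))


# Hypothetical scores, made up for illustration only.
male_scores = [52, 61, 58, 70, 66, 49, 73, 64]
female_scores = [68, 75, 71, 80, 62, 77, 69, 83]

d = ks_two_sample(male_scores, female_scores)
crit = approx_critical_value(len(male_scores), len(female_scores))
print(f"D = {d:.3f}, approximate 5% critical value = {crit:.3f}")
```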

...
