# Interpreting results: Kolmogorov-Smirnov test

## Key facts about the Kolmogorov-Smirnov test

The two-sample Kolmogorov-Smirnov test is a nonparametric test that compares the cumulative distributions of two data sets (1, 2).

Because the test is nonparametric, it does not assume that the data are sampled from Gaussian distributions (or any other defined distributions).

The results will not change if you transform all the values to logarithms or reciprocals, or apply any other transformation that preserves (or exactly reverses) the order of the values. The KS test reports the maximum difference between the two cumulative distributions, and calculates a P value from that difference and the sample sizes. Such a transformation stretches or compresses the X axis of the frequency distribution, but cannot change the maximum distance between the two cumulative frequency distributions.

Converting all values to their ranks also would not change the maximum difference between the cumulative frequency distributions (pages 35-36 of Lehmann, reference 2). Thus, although the test analyzes the actual data, it is equivalent to an analysis of ranks. As a result, the test is fairly robust to outliers (like the Mann-Whitney test).
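The invariance described above can be checked numerically. The following is a minimal sketch, not Prism's implementation: the helper names `ks_d` and `pooled_ranks` and the data values are made up for illustration, and the rank helper assumes there are no ties.

```python
import math
from bisect import bisect_right


def ks_d(a, b):
    """Maximum vertical distance between the two empirical cumulative distributions."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The distance can only change at an observed value, so check each one.
    for x in set(a_sorted) | set(b_sorted):
        f_a = bisect_right(a_sorted, x) / len(a_sorted)
        f_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(f_a - f_b))
    return d


def pooled_ranks(a, b):
    """Replace each value by its rank in the pooled data (assumes no ties)."""
    rank = {v: i for i, v in enumerate(sorted(list(a) + list(b)), start=1)}
    return [rank[v] for v in a], [rank[v] for v in b]


# Made-up illustrative data, all positive so that logarithms are defined.
group_a = [1.2, 3.4, 2.2, 8.9, 5.1]
group_b = [4.0, 9.5, 7.7, 12.3, 6.1, 10.8]

d_raw = ks_d(group_a, group_b)
d_log = ks_d([math.log(v) for v in group_a], [math.log(v) for v in group_b])
d_inv = ks_d([1 / v for v in group_a], [1 / v for v in group_b])
d_rank = ks_d(*pooled_ranks(group_a, group_b))

print(d_raw, d_log, d_inv, d_rank)  # the four values agree up to rounding
```

Logarithms and ranks keep the order of the values, and reciprocals of positive values reverse it exactly, so in each case the largest gap between the two cumulative distributions is unchanged.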

The null hypothesis is that both groups were sampled from populations with identical distributions. It tests for any violation of that null hypothesis -- different medians, different variances, or different distributions.

Because it tests for more deviations from the null hypothesis than does the Mann-Whitney test, it has less power to detect a shift in the median but more power to detect changes in the shape of the distributions (Lehmann, page 39).

Since the test does not compare any particular parameter (e.g. mean or median), it does not report any confidence interval.

Don't use the Kolmogorov-Smirnov test if the outcomes (Y values) are categorical, with many ties. Use it only for ratio or interval data, where ties are rare.

The concept of one- and two-tail P values only makes sense when you are looking at an outcome that has two possible directions (e.g. the difference between two means). Two cumulative distributions can differ in lots of ways, so the concept of tails is not really appropriate. The P value reported by Prism essentially has many tails. Some texts call this a two-tail P value.

## Interpreting the P value

The P value is the answer to this question:

If the two samples were randomly sampled from identical populations, what is the probability that the two cumulative frequency distributions would be as far apart as observed? More precisely, what is the chance that the value of the Kolmogorov-Smirnov D statistic would be as large as or larger than observed?

If the P value is small, conclude that the two groups were sampled from populations with different distributions. The populations may differ in median, variability or the shape of the distribution.

## Graphing the cumulative frequency distributions

The KS test works by comparing the two cumulative frequency distributions, but it does not graph those distributions. To do that, go back to the data table, click Analyze and choose the Frequency distribution analysis. Choose to create cumulative distributions and to tabulate relative frequencies.

## Don't confuse with the KS normality test

It is easy to confuse the two-sample Kolmogorov-Smirnov test (which compares two groups) with the one-sample Kolmogorov-Smirnov test, also called the Kolmogorov-Smirnov goodness-of-fit test, which tests whether one distribution differs substantially from theoretical expectations.

The one-sample test is most often used as a normality test to compare the distribution of data in a single data set with the predictions of a Gaussian distribution. Prism performs this normality test as part of the Column Statistics analysis.

## Comparison with the Mann-Whitney test

The Mann-Whitney test is also a nonparametric test to compare two unpaired groups. The Mann-Whitney test works by ranking all the values from low to high, and comparing the mean rank of the values in the two groups.
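As a rough illustration of the ranking step described above, the sketch below pools two groups, ranks the values from low to high, and reports the mean rank of each group. The function name `mean_ranks` and the data are hypothetical, and giving tied values the average of their ranks is a common convention assumed here rather than something stated in this guide.

```python
def mean_ranks(a, b):
    """Return the mean pooled rank of each group (ties get the average rank)."""
    pooled = sorted(list(a) + list(b))
    # Collect the 1-based positions occupied by each distinct value.
    positions = {}
    for i, v in enumerate(pooled, start=1):
        positions.setdefault(v, []).append(i)
    rank_of = {v: sum(p) / len(p) for v, p in positions.items()}
    mean_a = sum(rank_of[v] for v in a) / len(a)
    mean_b = sum(rank_of[v] for v in b) / len(b)
    return mean_a, mean_b


# Made-up data: every value in the first group is below every value in the second.
print(mean_ranks([1, 2, 3], [4, 5, 6]))
```

This only shows the ranking step. Turning the mean ranks into a P value is a separate calculation that is not sketched here.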

## How Prism computes the P value

Prism first generates the two cumulative relative frequency distributions, and then asks how far apart those two distributions are at the point where they are furthest apart. Prism uses the method explained by Lehmann (2). This distance is reported as Kolmogorov-Smirnov D.
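In symbols, the distance described above can be written as follows. The notation ($F_1$, $F_2$ for the two cumulative relative frequency distributions, $x_{ij}$ for the $j$-th value in group $i$, and $n_1$, $n_2$ for the sample sizes) is introduced here for illustration and is not taken from Prism's documentation.

$$F_i(x) = \frac{1}{n_i}\,\#\{\, j : x_{ij} \le x \,\}, \qquad i = 1, 2$$

$$D = \max_x \,\bigl| F_1(x) - F_2(x) \bigr|$$

Because both $F_1$ and $F_2$ are step functions that change only at observed values, the maximum is always reached at one of the data values.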

The P value is computed from this maximum distance between the cumulative frequency distributions, accounting for sample size in the two groups. With larger samples, an excellent approximation is used (2, 3).
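One widely used large-sample approximation, given in reference 3, is based on the limiting Kolmogorov distribution with a small-sample correction to the effective sample size. The sketch below follows that general form. Whether Prism uses exactly this formula is an assumption, and the function name, the cutoff for very small values, and the number of series terms are choices made here for illustration.

```python
import math


def ks_asymptotic_p(d, n1, n2, terms=100):
    """Approximate P value for a two-sample KS statistic d (large-sample formula)."""
    n_eff = n1 * n2 / (n1 + n2)  # effective sample size
    root = math.sqrt(n_eff)
    lam = (root + 0.12 + 0.11 / root) * d
    # Assumption: below this cutoff the P value is 1 to within rounding,
    # and the series converges too slowly to be worth summing.
    if lam < 0.2:
        return 1.0
    # Alternating series 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 j^2 lam^2)
    total = 0.0
    for j in range(1, terms + 1):
        total += (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical inputs: the same sample sizes with a small and a large D.
print(ks_asymptotic_p(0.1, 50, 50))
print(ks_asymptotic_p(0.5, 50, 50))
```

For a fixed pair of sample sizes, a larger D gives a smaller P value, and for a fixed D, larger samples give a smaller P value.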

An exact method is used when the samples are small, defined by Prism to mean when the number of permutations of n1 values from n1+n2 values is less than 60,000, where n1 and n2 are the two sample sizes. Thus an exact test is used for these pairs of group sizes (the two numbers in parentheses are the numbers of values in the two groups):

(2, 2), (2, 3) ... (2, 346)

(3, 3), (3, 4) ... (3, 69)

(4, 4), (4, 5) ... (4, 32)

(5, 5), (5, 6) ... (5, 20)

(6, 6), (6, 7) ... (6, 15)

(7, 7), (7, 8) ... (7, 12)

(8, 8), (8, 9), (8, 10)

(9, 9)

Prism accounts for ties in its exact algorithm (developed in-house). It systematically shuffles the actual data between the two groups (maintaining sample size). The P value it reports is the fraction of these reshuffled data sets for which the D computed from the reshuffled data is greater than or equal to the D computed from the actual data.
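The reshuffling idea can be sketched with a brute-force enumeration. Prism's in-house algorithm is not published in this guide, so the code below is only an illustration of the principle: the function names `ks_d` and `exact_ks_p` are hypothetical, and enumerating every split this way is practical only for small samples.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import combinations


def ks_d(a, b):
    """KS distance as an exact fraction; ties are handled by evaluating
    both cumulative distributions after all copies of each value."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = Fraction(0)
    for x in set(a_sorted) | set(b_sorted):
        f_a = Fraction(bisect_right(a_sorted, x), len(a_sorted))
        f_b = Fraction(bisect_right(b_sorted, x), len(b_sorted))
        d = max(d, abs(f_a - f_b))
    return d


def exact_ks_p(a, b):
    """Fraction of all regroupings whose D is at least the observed D."""
    pooled = list(a) + list(b)
    observed = ks_d(a, b)
    indices = range(len(pooled))
    hits = total = 0
    # Every way of choosing which pooled values form the first group.
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        g1 = [pooled[i] for i in chosen]
        g2 = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if ks_d(g1, g2) >= observed:
            hits += 1
    return Fraction(hits, total)


# Made-up data: two completely separated groups of three values.
print(exact_ks_p([1, 2, 3], [4, 5, 6]))
```

Exact fractions are used instead of floating-point numbers so that the "greater than or equal to" comparison is not affected by rounding when a reshuffled D equals the observed D.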

## References

1. Kirkman, T.W. (1996) Statistics to Use: Kolmogorov-Smirnov test. (Accessed 10 Feb 2010)

2. Lehmann, E. (2006) Nonparametrics: Statistical methods based on ranks. ISBN: 978-0387352121

3. Press, W.H., et al. Numerical Recipes, third edition. Cambridge University Press. ISBN: 0521880688