
# Choosing a normality test

Prism offers four normality tests. Why is there more than one way to test normality? There are many ways a distribution can deviate from a Gaussian distribution, so different normality tests give different results.

We recommend the D'Agostino-Pearson normality test. It first computes the skewness and kurtosis to quantify how far the distribution is from Gaussian in terms of asymmetry and shape. It then calculates how far each of these values differs from the value expected with a Gaussian distribution, and computes a single P value from the sum of these discrepancies. This makes it a versatile and powerful normality test. Note that D'Agostino developed several normality tests; the one used by Prism is the "omnibus K2" test.
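
To see the omnibus K2 idea outside of Prism, here is a sketch using SciPy, whose `scipy.stats.normaltest` function implements the same D'Agostino-Pearson test (the data and variable names are made up for the example):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)                  # fixed seed, illustrative data
sample = rng.normal(loc=10, scale=2, size=200)

# K2 combines the z-scores for skewness and kurtosis:
#   K2 = z_skew**2 + z_kurt**2
# Under the null hypothesis of normality, K2 follows a chi-squared
# distribution with 2 degrees of freedom, which yields the P value.
k2, p_value = stats.normaltest(sample)
```

A large K2 (small P value) means the sample's asymmetry and/or tail shape differ more from Gaussian than chance alone would explain.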

An alternative is the Anderson-Darling test. It computes the P value by comparing the cumulative distribution of your data set against the ideal cumulative distribution of a Gaussian distribution. It takes into account the discrepancies at all parts of the cumulative distribution curve (unlike the Kolmogorov-Smirnov test, see below).
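
As an independent illustration (SciPy's `scipy.stats.anderson`, not Prism's own implementation), the test weighs discrepancies across the entire cumulative curve and, in this implementation, reports critical values rather than a single P value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)   # fixed seed, illustrative data
sample = rng.normal(size=200)

# anderson() accumulates discrepancies over the whole cumulative
# distribution curve (with extra weight in the tails) into the A2
# statistic, and returns critical values at the 15%, 10%, 5%, 2.5%
# and 1% significance levels.
result = stats.anderson(sample, dist='norm')
a2 = result.statistic
passes_at_5pct = a2 < result.critical_values[2]   # index 2 is the 5% level
```

Unlike the other tests described here, this implementation returns no P value directly; you compare A² against the tabulated critical values.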

Another alternative is the Shapiro-Wilk normality test. We prefer the D'Agostino-Pearson test for two reasons. One reason is that, while the Shapiro-Wilk test works very well if every value is unique, it does not work as well when several values are identical. The other reason is that the basis of the test is hard to understand. There are several ways to compute the Shapiro-Wilk test. Prism uses the method of Royston (1).
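
As a sketch outside of Prism (using SciPy's `scipy.stats.shapiro`), the test reduces to a single W statistic and a P value; the caveat about tied values applies to any implementation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)   # fixed seed, illustrative data
sample = rng.normal(size=100)

# shapiro() returns the W statistic (close to 1 for Gaussian-looking
# data) and a P value. With many identical (tied) values the test
# becomes less reliable, as noted above.
w, p_value = stats.shapiro(sample)
```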

Earlier versions of Prism offered only the Kolmogorov-Smirnov test. We still offer this test (for consistency) but no longer recommend it. It computes a P value from a single value: the largest discrepancy between the cumulative distribution of the data and a cumulative Gaussian distribution. This is not a very sensitive way to assess normality, and we now agree with this statement: "The Kolmogorov-Smirnov test is only a historical curiosity. It should never be used." (2). Note that both this test and the Anderson-Darling test compare the actual and ideal cumulative distributions. The distinction is that Anderson-Darling considers the discrepancies at all parts of the curve, while Kolmogorov-Smirnov only looks at the largest discrepancy.
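
For comparison outside of Prism, the classical test (with the Gaussian fully specified in advance) can be sketched with SciPy's `scipy.stats.kstest`; the mean and SD values here are made up:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)   # fixed seed, illustrative data
sample = rng.normal(loc=5, scale=2, size=200)

# The classical KS test compares the sample's empirical cumulative
# distribution against a fully specified Gaussian: the mean (5) and
# SD (2) are assumed known in advance, not estimated from the data.
# D is the single largest vertical gap between the two curves.
d, p_value = stats.kstest(sample, 'norm', args=(5, 2))
```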

The Kolmogorov-Smirnov method as originally published assumes that you know the mean and SD of the overall population (perhaps from prior work). When analyzing data, you rarely know the overall population mean and SD. You only know the mean and SD of your sample. To compute the P value, therefore, Prism uses the Dallal and Wilkinson approximation to Lilliefors' method (3). Since that method is only accurate with small P values, Prism simply reports “P>0.10” for large P values. In case you encounter any discrepancies, you should know that we fixed a bug in this test many years ago in Prism 4.01 and 4.0b.
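
This distinction matters in code as well. A common mistake, sketched below with SciPy, is to plug the sample's own mean and SD into the classical KS test; the resulting P value is biased upward, which is exactly the problem the Lilliefors / Dallal-Wilkinson correction addresses (SciPy's `kstest` does not apply the correction; `statsmodels.stats.diagnostic.lilliefors` provides one implementation):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)   # fixed seed, illustrative data
sample = rng.normal(size=100)

# Estimating the mean and SD from the same sample being tested makes
# the fitted Gaussian hug the data too closely, so the KS statistic D
# is systematically smaller than the classical null distribution
# assumes, and the naive P value comes out too large.
mu, sigma = sample.mean(), sample.std(ddof=1)
d_naive, p_naive = stats.kstest(sample, 'norm', args=(mu, sigma))
```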

## References

1. Royston P (1995), "Remark AS R94: A Remark on Algorithm AS 181: The W-test for Normality," Journal of the Royal Statistical Society, Series C (Applied Statistics), 44, 547-551.

2. D'Agostino RB, "Tests for Normal Distribution" in Goodness-of-Fit Techniques, edited by RB D'Agostino and MA Stephens, Marcel Dekker, 1986.

3. Dallal GE and Wilkinson L (1986), "An Analytic Approximation to the Distribution of Lilliefors's Test Statistic for Normality," The American Statistician, 40, 294-296.