GraphPad Statistics Guide

Interpreting results: Kurtosis

Kurtosis

Kurtosis quantifies whether the tails of the data distribution match those of a Gaussian distribution.

A Gaussian distribution has a kurtosis of 0.

A distribution with fewer values in the tails than a Gaussian distribution has a negative kurtosis.

A distribution with more values in the tails (or values further out in the tails) than a Gaussian distribution has a positive kurtosis.

Kurtosis has no units.

Although it is commonly thought to measure the shape of the peak, kurtosis actually tells you virtually nothing about the shape of the peak. Its only unambiguous interpretation is in terms of the values in the tail. Essentially it measures the presence of outliers (1).

The value that Prism reports is sometimes called the excess kurtosis since the expected kurtosis for a Gaussian distribution is 0.0.

An alternative definition of kurtosis is computed by adding 3 to the value reported by Prism. With this definition, a Gaussian distribution is expected to have a kurtosis of 3.0.
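The statements above can be checked on simulated data. What follows is a minimal Python sketch, not Prism's own code. The helper name excess_kurtosis, the sample size, the seed, and the choice of a uniform and a Laplace distribution as light-tailed and heavy-tailed examples are all assumptions made for illustration. The helper follows the computation described under "How Kurtosis is computed".

```python
import random
import statistics

def excess_kurtosis(values):
    # Hypothetical helper: average of z**4 (dividing by n-1), minus 3.
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n-1 denominator)
    z4_sum = sum(((v - mean) / sd) ** 4 for v in values)
    return z4_sum / (len(values) - 1) - 3.0

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
n = 20000               # illustrative sample size

gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Uniform distribution: fewer values in the tails than a Gaussian.
uniform = [rng.uniform(-1.0, 1.0) for _ in range(n)]
# Difference of two exponentials is Laplace distributed: heavier tails.
laplace = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]

k_gaussian = excess_kurtosis(gaussian)
k_uniform = excess_kurtosis(uniform)
k_laplace = excess_kurtosis(laplace)

for name, k in [("Gaussian", k_gaussian), ("Uniform", k_uniform), ("Laplace", k_laplace)]:
    # The alternative definition is the excess kurtosis plus 3.
    print(name, "excess:", round(k, 2), "alternative:", round(k + 3.0, 2))
```

With samples this large, the Gaussian value should land near 0 (near 3 under the alternative definition), the uniform value should be negative, and the Laplace value should be positive. The exact numbers depend on the seed.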

How Kurtosis is computed

1. Subtract the sample mean from each value. The result will be positive for values greater than the mean, negative for values that are smaller than the mean, and zero for values that exactly equal the mean.

2. Divide each of the differences computed in step 1 by the standard deviation of the values. These ratios (the difference between each value and the mean divided by the standard deviation) are called z ratios. By definition, the average of these values is zero and their standard deviation is 1.

3. For each value, compute z^4 (z to the fourth power). None of these values can be negative.

4. Average that list of values by dividing the sum of those values by n-1, where n is the number of values in the sample. Why n-1 rather than n? For the same reason that n-1 is used when computing the standard deviation.

5. With a Gaussian distribution, you expect that average to equal 3. Therefore, subtract 3 from that average. Gaussian data are expected to have a kurtosis of 0. This value (after subtracting 3) is sometimes called the excess kurtosis.
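The five steps above can be sketched in Python as follows. This is a literal transcription of the steps as written, not code taken from Prism. The function name kurtosis_by_steps and the example data are assumptions made for illustration, and other software may use different small-sample conventions and so report slightly different values.

```python
import statistics

def kurtosis_by_steps(values):
    """Excess kurtosis computed by following the five steps literally."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n-1 denominator)

    # Step 1: subtract the sample mean from each value.
    differences = [v - mean for v in values]
    # Step 2: divide by the standard deviation to get z ratios.
    z = [d / sd for d in differences]
    # Step 3: raise each z ratio to the fourth power.
    z4 = [zi ** 4 for zi in z]
    # Step 4: "average" by dividing the sum by n-1.
    average = sum(z4) / (n - 1)
    # Step 5: subtract 3 so that Gaussian data are expected to give 0.
    return average - 3.0

example = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up data
result = kurtosis_by_steps(example)
print(round(result, 2))
```

For these five evenly spaced values the arithmetic can be done by hand: the variance is 2.5, the fourth powers of the differences sum to 34, and 34 / 6.25 / 4 - 3 = -1.64. The negative result reflects the absence of values far out in the tails.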

Why don't values in the middle of the distribution affect the kurtosis very much?

Because the z values are taken to the fourth power, only large z values (so only values far from the mean) have a big impact on the kurtosis. If one value has a z value of 1 and another has a z value of 2, the second value will have 16 times as much impact on the kurtosis (because 2 to the fourth power is 16). If one value has a z value of 1 and another has a z value of 3 (so is three times further from the mean), the second value will have 81 times as much impact on the kurtosis (because 3 to the fourth power is 81). Accordingly, values near the mean (especially those less than one SD from the mean) have very little impact on the kurtosis, while values far from the mean have a huge impact. For this reason, the kurtosis does not quantify peakedness and does not really quantify the shape of the bulk of the distribution. Rather, kurtosis quantifies the overall impact of points far from the mean.
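To make the arithmetic concrete, the short Python sketch below computes the relative weights of z values of 1, 2, and 3, and then measures how much of the sum of fourth powers comes from a single outlying value. The data set is made up for illustration, and the variable names are assumptions rather than anything defined by Prism.

```python
import statistics

# Relative impact of z = 1, 2, 3 after raising to the fourth power.
weights = [z ** 4 for z in (1, 2, 3)]
print(weights)

# Made-up sample: nine values near 10 and one value far from the rest.
values = [9.0, 10.0, 10.0, 11.0, 10.0, 9.0, 11.0, 10.0, 10.0, 20.0]
mean = statistics.mean(values)
sd = statistics.stdev(values)  # sample SD (n-1 denominator)

# Fourth power of each z ratio.
z4 = [((v - mean) / sd) ** 4 for v in values]

# Fraction of the total contributed by the single largest term.
outlier_share = max(z4) / sum(z4)
print(round(outlier_share, 3))
```

In this sample, the one value far from the mean accounts for more than 99% of the sum of fourth powers, so the nine values in the middle of the distribution have almost no influence on the result.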

Reference

1. Westfall, P. H. (2014). Kurtosis as Peakedness, 1905–2014. R.I.P. The American Statistician, 68(3), 191–195.