Interpreting results: Skewness and kurtosis
Interpreting skewness and kurtosis
Skewness quantifies how symmetrical the distribution is.
• A symmetrical distribution has a skewness of zero.
• An asymmetrical distribution with a long tail to the right (higher values) has a positive skew.
• An asymmetrical distribution with a long tail to the left (lower values) has a negative skew.
• The skewness is unitless.
• Any threshold or rule of thumb is arbitrary, but here is one: If the skewness is greater than 1.0 (or less than -1.0), the skewness is substantial and the distribution is far from symmetrical.
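To make the sign convention concrete, here is a minimal Python sketch (the function name is illustrative, not Prism's) computing a simple, uncorrected skewness, the mean of the cubed z ratios, for a symmetric sample and a right-skewed one:

```python
import math

def simple_skew(values):
    """Uncorrected (population) skewness: the mean of the cubed z ratios."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)  # population SD
    return sum(((x - mean) / sd) ** 3 for x in values) / n

print(simple_skew([1, 2, 3, 4, 5]))    # symmetric, so skewness is 0
print(simple_skew([1, 2, 3, 4, 100]))  # long right tail, so skewness is positive
```

Note this sketch omits the bias correction described later on this page, so its values will differ slightly from Prism's for small samples.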
Kurtosis quantifies whether the shape of the data distribution matches the Gaussian distribution.
• A Gaussian distribution has a kurtosis of 0.
• A flatter distribution has a negative kurtosis.
• A distribution more peaked than a Gaussian distribution has a positive kurtosis.
• Kurtosis has no units.
• The value that Prism reports is sometimes called the excess kurtosis, since the expected kurtosis for a Gaussian distribution is 0.0.
• An alternative definition of kurtosis is computed by adding 3 to the value reported by Prism. With this definition, a Gaussian distribution is expected to have a kurtosis of 3.0.
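As an illustration (the function name is mine, not Prism's), excess kurtosis can be computed as the mean of the fourth power of the z ratios, minus 3; a flat sample comes out negative:

```python
import math

def excess_kurtosis(values):
    """Uncorrected (population) excess kurtosis: mean of z**4, minus 3."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)  # population SD
    return sum(((x - mean) / sd) ** 4 for x in values) / n - 3

# A flat (uniform-like) sample is less peaked than a Gaussian,
# so its excess kurtosis is negative. Adding 3 gives the
# alternative definition, under which a Gaussian scores 3.0.
print(excess_kurtosis(list(range(1, 11))))
```

This is the simple population version; Prism applies a small-sample bias correction, so the values will not match exactly for small n.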
How skewness is computed
Skewness has been defined in multiple ways. The steps below explain the method used by Prism, called g1 (the most common method). It is identical to the skew() function in Excel.
1. We want to know about symmetry around the sample mean. So the first step is to subtract the sample mean from each value. The result will be positive for values greater than the mean, negative for values that are smaller than the mean, and zero for values that exactly equal the mean.
2. To compute a unitless measure of skewness, divide each of the differences computed in step 1 by the standard deviation of the values. These ratios (the difference between each value and the mean, divided by the standard deviation) are called z ratios. By definition, the average of these values is zero and their standard deviation is 1.
3. For each value, compute z³. Note that cubing values preserves the sign. The cube of a positive value is still positive, and the cube of a negative value is still negative.
4. Average the list of z³ values by dividing the sum of those values by n-1, where n is the number of values in the sample. If the distribution is symmetrical, the positive and negative values will balance each other, and the average will be close to zero. If the distribution is not symmetrical, the average will be positive if the distribution is skewed to the right, and negative if skewed to the left. Why n-1 rather than n? For the same reason that n-1 is used when computing the standard deviation.
5. Correct for bias. For reasons that I do not really understand, the average computed in step 4 is biased with small samples -- its absolute value is smaller than it should be. Correct for the bias by multiplying the mean of z³ by the ratio n/(n-2). This correction increases the value if the skewness is positive, and makes the value more negative if the skewness is negative. With large samples, this correction is trivial. But with small samples, the correction is substantial.
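The five steps above can be sketched directly in Python. This is a sketch, not Prism's source code; algebraically it matches the n/((n-1)(n-2)) form used by Excel's skew() for n of 3 or more:

```python
import math

def skewness_g1(values):
    """Bias-corrected sample skewness, following the five steps above."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    # Step 1: subtract the sample mean from each value.
    mean = sum(values) / n
    deviations = [x - mean for x in values]
    # Step 2: divide by the sample standard deviation to get z ratios.
    sd = math.sqrt(sum(d * d for d in deviations) / (n - 1))
    z = [d / sd for d in deviations]
    # Step 3: cube each z ratio (cubing preserves the sign).
    z3 = [v ** 3 for v in z]
    # Step 4: average, dividing by n-1 rather than n.
    avg = sum(z3) / (n - 1)
    # Step 5: correct for small-sample bias by multiplying by n/(n-2).
    return avg * n / (n - 2)

print(skewness_g1([1, 2, 3, 4, 5]))    # symmetric: 0
print(skewness_g1([1, 2, 3, 4, 100]))  # skewed right: positive
```

Note that steps 4 and 5 combine to the single factor n/((n-1)(n-2)) applied to the sum of the cubed z ratios, which is why the result agrees with Excel's formula.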