GraphPad Statistics Guide

Confidence interval of a standard deviation

A confidence interval can be computed for almost any value computed from a sample of data, including the standard deviation.

The SD of a sample is not the same as the SD of the population

It is straightforward to calculate the standard deviation from a sample of values. But how accurate is that standard deviation? Just by chance you may have happened to obtain data that are closely bunched together, making the SD low. Or you may have randomly obtained values that are far more scattered than the overall population, making the SD high. The SD of your sample does not equal, and may be quite far from, the SD of the population.
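A quick simulation can make this concrete. The sketch below is only an illustration, assuming a Gaussian population with a true SD of 1.0; the function name `sample_sds`, the seed, and the number of trials are arbitrary choices, not part of any GraphPad tool.

```python
import random
import statistics

def sample_sds(n, trials, population_sd=1.0, seed=0):
    """Draw `trials` samples of size n from a Gaussian population
    (mean 0, SD `population_sd`) and return the SD of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.stdev([rng.gauss(0.0, population_sd) for _ in range(n)])
        for _ in range(trials)
    ]

# The population SD is 1.0, yet individual samples of five values
# give SDs that land well below and well above it.
sds = sample_sds(n=5, trials=1000)
print(f"smallest sample SD: {min(sds):.2f}")
print(f"largest sample SD:  {max(sds):.2f}")
```

Raising `n` in the call narrows the spread of the sample SDs around the population value.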

Confidence intervals are not just for means

Confidence intervals are most often computed for a mean. But the idea of a confidence interval is very general, and you can express the precision of any computed value as a 95% confidence interval (CI). Another example is the confidence interval of a best-fit value from regression, such as a slope.

The 95% CI of the SD

The sample SD is just a value you compute from a sample of data. It's not done often, but it is certainly possible to compute a CI for a SD. GraphPad Prism does not do this calculation, but a free GraphPad QuickCalc does.

Interpreting the CI of the SD is straightforward. If you assume that your data were randomly and independently sampled from a Gaussian distribution, you can be 95% sure that the CI contains the true population SD.

How wide is the CI of the SD? Of course the answer depends on sample size (n). With small samples, the interval is quite wide as shown in the table below.

n       95% CI of SD
2       0.45*SD to 31.9*SD
3       0.52*SD to 6.29*SD
5       0.60*SD to 2.87*SD
10      0.69*SD to 1.83*SD
25      0.78*SD to 1.39*SD
50      0.84*SD to 1.25*SD
100     0.88*SD to 1.16*SD
500     0.94*SD to 1.07*SD
1000    0.96*SD to 1.05*SD
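A row of the table can be checked by simulation. The sketch below, which assumes a Gaussian population with a known SD of 1.0, counts how often the interval built from the n=5 multipliers (0.60 and 2.87) contains the true SD. The function name `coverage` and its parameters are illustrative only.

```python
import math
import random

def coverage(n, lower_mult, upper_mult, trials=20000, population_sd=1.0, seed=0):
    """Fraction of simulated samples whose interval
    [lower_mult*SD, upper_mult*SD] contains the population SD."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        values = [rng.gauss(0.0, population_sd) for _ in range(n)]
        mean = sum(values) / n
        # Sample SD with the usual n-1 denominator
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
        if lower_mult * sd <= population_sd <= upper_mult * sd:
            hits += 1
    return hits / trials

# Multipliers taken from the n=5 row of the table
fraction = coverage(5, 0.60, 2.87)
print(f"interval contained the true SD in {fraction:.1%} of samples")
```

The fraction should come out close to 95%, with small deviations due to the rounded multipliers and the finite number of simulated samples.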

Example

Data: 23, 31, 25, 30, 27

Mean: 27.2

SD: 3.35

The sample standard deviation computed from the five values is 3.35. But the true standard deviation of the population from which the values were sampled might be quite different. From the n=5 row of the table, the 95% confidence interval extends from 0.60 times the SD to 2.87 times the SD. Thus the 95% confidence interval ranges from 0.60*3.35 to 2.87*3.35, that is, from 2.01 to 9.62 (computed before rounding the SD and the multipliers). When you compute a SD from only five values, the upper 95% confidence limit for the SD is almost five times the lower limit.

Most people are surprised that small samples define the SD so poorly. Random sampling can have a huge impact with small data sets, resulting in a calculated standard deviation quite far from the true population standard deviation.

Note that the confidence interval is not symmetrical around the computed SD. Why? Since the SD is always a positive number, the lower confidence limit can't be less than zero. This means that the upper confidence limit usually extends further above the sample SD than the lower limit extends below it. With small samples, this asymmetry is quite noticeable.
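The asymmetry can be read directly off the table multipliers. The short sketch below copies the table rows into a list and compares how far each limit lies from the sample SD, expressed in units of the SD; the names `TABLE` and `asymmetry` are illustrative.

```python
# (n, lower multiplier, upper multiplier), copied from the table
TABLE = [
    (2, 0.45, 31.9), (3, 0.52, 6.29), (5, 0.60, 2.87),
    (10, 0.69, 1.83), (25, 0.78, 1.39), (50, 0.84, 1.25),
    (100, 0.88, 1.16), (500, 0.94, 1.07), (1000, 0.96, 1.05),
]

def asymmetry(table):
    """For each row, return (n, distance below the SD, distance above the SD),
    both in multiples of the sample SD."""
    return [(n, 1.0 - lo, hi - 1.0) for n, lo, hi in table]

for n, below, above in asymmetry(TABLE):
    print(f"n={n:>4}: {below:.2f}*SD below, {above:.2f}*SD above")
```

In every row the upper limit lies further from the sample SD than the lower limit does, and the gap shrinks as n grows.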

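Computing the CI of a SD with Excel

These Excel formulas compute the confidence interval of a SD. SD is the sample standard deviation; n is the sample size; alpha is 0.05 for 95% confidence, 0.01 for 99% confidence, etc. Excel's CHIINV takes a right-tail probability, which is why alpha/2 appears in the lower limit and 1-alpha/2 in the upper limit: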

Lower limit: =SD*SQRT((n-1)/CHIINV((alpha/2), n-1))

Upper limit: =SD*SQRT((n-1)/CHIINV(1-(alpha/2), n-1))

These equations come from pages 197-198 of Sheskin (reference below).
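For readers working outside Excel, the same calculation can be sketched in Python using only the standard library. This is a minimal sketch rather than a reference implementation: `chi2_cdf`, `chi2_inv_right_tail` (a stand-in for CHIINV), and `sd_confidence_interval` are names invented here, and the chi-square inverse is found by simple bisection, which is adequate for the sample sizes in the table but is not a substitute for a statistics library.

```python
import math
import statistics

def chi2_cdf(x, df):
    """Lower-tail chi-square probability, using the power series for the
    regularized incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-16:  # stop once further terms no longer matter
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_inv_right_tail(p, df):
    """Value whose right-tail chi-square probability is p
    (the same convention as Excel's CHIINV). Found by bisection."""
    target = 1.0 - p
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < target:  # widen the bracket until it holds the answer
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def sd_confidence_interval(sd, n, alpha=0.05):
    """Lower and upper confidence limits of a SD, mirroring the Excel formulas."""
    df = n - 1
    lower = sd * math.sqrt(df / chi2_inv_right_tail(alpha / 2, df))
    upper = sd * math.sqrt(df / chi2_inv_right_tail(1 - alpha / 2, df))
    return lower, upper

# The five values from the Example section
data = [23, 31, 25, 30, 27]
sd = statistics.stdev(data)
lower, upper = sd_confidence_interval(sd, len(data))
print(f"SD = {sd:.2f}, 95% CI from {lower:.2f} to {upper:.2f}")
```

Run on the example data, the result should agree with the interval of 2.01 to 9.62 given in the Example section. If SciPy is available, `scipy.stats.chi2.isf(p, df)` could replace `chi2_inv_right_tail`.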

Reference

David J. Sheskin, Handbook of Parametric and Nonparametric Statistical Procedures, Fourth Edition, ISBN: 1584888148.