# Analysis checklist: Column statistics

## Descriptive statistics

| Value | Meaning |
| --- | --- |
| Minimum | The smallest value. |
| 25th percentile | 25% of values are lower than this. |
| Median | Half the values are lower; half are higher. |
| 75th percentile | 75% of values are lower than this. |
| Maximum | The largest value. |
| Mean | The average. |
| Standard Deviation | Quantifies variability or scatter. |
| Standard Error of Mean | Quantifies how precisely the mean is known. |
| 95% confidence interval | Given some assumptions, there is a 95% chance that this range includes the true overall mean. |
| Coefficient of variation | The standard deviation divided by the mean. |
| Geometric mean | Compute the logarithm of all values, compute the mean of the logarithms, and then take the antilog. It is a better measure of central tendency when data follow a lognormal distribution (long tail). |
| Skewness | Quantifies how symmetrical the distribution is. A symmetrical distribution has a skewness of 0. |
| Kurtosis | Quantifies whether the tails of the data distribution match the Gaussian distribution. A Gaussian distribution has a kurtosis of 0. |
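Prism computes these values for you. As a rough illustration of what the table describes (not Prism's own code), the same quantities can be computed with Python's standard library; the data values below are invented:

```python
import math
import statistics

data = [4.2, 5.1, 6.3, 5.8, 4.9, 7.0, 5.5, 6.1]  # hypothetical sample
n = len(data)

mean = statistics.mean(data)
median = statistics.median(data)
sd = statistics.stdev(data)            # sample SD (n - 1 in the denominator)
sem = sd / math.sqrt(n)                # standard error of the mean
cv = sd / mean                         # coefficient of variation
geo_mean = statistics.geometric_mean(data)  # antilog of the mean of the logs

# The 95% CI of the mean uses the t distribution; with n = 8 there are
# 7 degrees of freedom, and the two-tailed critical t value is about 2.365.
t_crit = 2.365
ci_low, ci_high = mean - t_crit * sem, mean + t_crit * sem
```

Note that the geometric mean is always less than the arithmetic mean for data that are not all identical, and it is only defined when every value is positive.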

## Normality tests

Normality tests are performed for each column of data. Each normality test reports a P value that answers this question:

If you randomly sample from a Gaussian population, what is the probability of obtaining a sample that deviates from a Gaussian distribution as much (or more so) as this sample does?

A small P value is evidence that your data were sampled from a nongaussian distribution. A large P value means that your data are consistent with a Gaussian distribution (but it certainly does not prove that the distribution is Gaussian).

Normality tests are less useful than many people expect. With small samples, normality tests have little power to detect nongaussian distributions; Prism won't even attempt a normality test with fewer than seven values. With large samples, it matters less whether the data are nongaussian, since t tests and ANOVA are fairly robust to violations of this assumption.

Normality tests can help you decide when to use nonparametric tests, but the decision should not be an automatic one.
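To make the idea concrete, here is a sketch (using SciPy's Shapiro-Wilk test rather than Prism itself) of running a normality test on two simulated samples, one drawn from a Gaussian population and one from a long-tailed lognormal population:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
gaussian_sample = rng.normal(loc=100, scale=15, size=50)    # Gaussian population
lognormal_sample = rng.lognormal(mean=0, sigma=1, size=50)  # long tail, nongaussian

# A small P value is evidence the sample did not come from a Gaussian population.
w_g, p_gaussian = stats.shapiro(gaussian_sample)
w_l, p_lognormal = stats.shapiro(lognormal_sample)
```

With these simulated data, the lognormal sample produces a very small P value, while the Gaussian sample typically does not; as the text notes, a large P value is consistent with a Gaussian distribution but does not prove it.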

## Inferences

A one-sample t test compares the mean of each column of numbers against a hypothetical mean that you provide.

The P value answers this question:

If the data were sampled from a Gaussian population with a mean equal to the hypothetical value you entered, what is the chance of randomly selecting N data points and finding a mean as far (or further) from the hypothetical value as observed here?

If the P value is small (usually defined to mean less than 0.05), then it is unlikely that the discrepancy you observed between sample mean and hypothetical mean is due to a coincidence arising from random sampling.
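The same kind of comparison can be sketched with SciPy's `ttest_1samp` (the measurements and hypothetical mean here are invented for illustration):

```python
from scipy import stats

data = [5.1, 4.8, 6.2, 5.9, 5.4, 6.0, 5.7, 5.2]  # hypothetical measurements
hypothetical_mean = 5.0

# Two-tailed one-sample t test of the column mean against the hypothetical mean
t_stat, p_value = stats.ttest_1samp(data, popmean=hypothetical_mean)
```

Here the sample mean (about 5.54) sits well above the hypothetical value of 5.0 relative to the scatter, so the P value comes out below 0.05.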

The nonparametric Wilcoxon signed-rank test is similar, but does not assume a Gaussian distribution. It asks whether the median of each column differs from a hypothetical median you entered.
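A sketch of the analogous nonparametric test with SciPy: `stats.wilcoxon` tests whether a set of differences is symmetrically distributed around zero, so subtracting the hypothetical median first turns it into a one-sample test (the data are invented):

```python
from scipy import stats

data = [5.1, 4.75, 6.2, 5.9, 5.4, 6.0, 5.7, 5.2]  # hypothetical measurements
hypothetical_median = 5.0

# Subtract the hypothetical median, then apply the Wilcoxon signed-rank
# test to ask whether the differences are centered on zero.
diffs = [x - hypothetical_median for x in data]
w_stat, p_value = stats.wilcoxon(diffs)
```

Because it ranks the differences rather than using their actual values, this test does not assume a Gaussian distribution, matching the description above.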