© 1999 GraphPad Software Inc.

The Prism Guide to Interpreting Statistical Results
This guide is excerpted from Analyzing Data with GraphPad Prism, a book that accompanies the program GraphPad Prism.

The Gaussian distribution

When many independent random factors act in an additive manner to create variability, data will follow a bell-shaped distribution called the Gaussian distribution, illustrated in the figure below. The left panel shows the distribution of a large sample of data. Each value is shown as a dot, with the points moved horizontally to avoid too much overlap. This is called a column scatter graph. The frequency distribution, or histogram, of the values is shown in the middle panel. It shows the exact distribution of values in this particular sample. The right panel shows an ideal Gaussian distribution.
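The claim that many additive random factors produce a bell-shaped distribution can be checked with a small simulation. The sketch below is illustrative only: the number of factors, the uniform distribution of each factor, and the function names are assumptions, not part of the original guide. It relies on the standard fact that about 68% of an ideal Gaussian distribution lies within one standard deviation of the mean, and about 95% within two.

```python
import random
import statistics

def simulate_values(n_values=20000, n_factors=50, seed=1):
    """Build each value as the sum of n_factors independent random contributions.

    Each contribution is uniform on [-1, 1], which is not Gaussian by itself
    (an arbitrary choice made for this illustration).
    """
    rng = random.Random(seed)
    return [
        sum(rng.uniform(-1.0, 1.0) for _ in range(n_factors))
        for _ in range(n_values)
    ]

def fraction_within(values, k):
    """Return the fraction of values within k standard deviations of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return sum(abs(v - mean) <= k * sd for v in values) / len(values)

values = simulate_values()
within_1sd = fraction_within(values, 1)
within_2sd = fraction_within(values, 2)

# Compare against the fractions expected for an ideal Gaussian distribution.
print(f"within 1 SD: {within_1sd:.3f} (ideal Gaussian: about 0.683)")
print(f"within 2 SD: {within_2sd:.3f} (ideal Gaussian: about 0.954)")
```

If the simulated fractions come out close to the ideal ones, the summed values are behaving approximately like a Gaussian sample, even though no single factor is Gaussian.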

The Gaussian distribution has some special mathematical properties that form the basis of many statistical tests. Although no data follow that mathematical ideal exactly, many kinds of data follow a distribution that is approximately Gaussian.

The Gaussian distribution is also called a Normal distribution. Don't confuse this use of the word "normal" with its usual meaning: a Normal distribution is not necessarily the usual, expected, or healthy one.

The Gaussian distribution plays a central role in statistics because of a mathematical relationship known as the Central Limit Theorem. To understand this theorem, follow this imaginary experiment:

  1. Create a population with a known distribution (which does not have to be Gaussian).
  2. Randomly pick many samples from that population. Tabulate the means of these samples.
  3. Draw a histogram of the frequency distribution of the means.

The Central Limit Theorem says that if your samples are large enough, the distribution of the sample means will be approximately Gaussian even if the population is not Gaussian. Since most statistical tests (such as the t test and ANOVA) are concerned only with differences between means, the Central Limit Theorem lets these tests work well even when the populations are not Gaussian. For this to be valid, the samples have to be reasonably large. How large is that? It depends on how much the population distribution differs from a Gaussian distribution. Assuming the population doesn't have a really weird distribution, a sample size of ten or so is generally enough to invoke the Central Limit Theorem.
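The imaginary experiment can also be run as a simulation. The following is a minimal sketch under stated assumptions: the population is exponential (a strongly skewed, non-Gaussian choice made only for illustration), and the sample sizes, number of samples, and function names are hypothetical. Skewness is used as a rough measure of asymmetry; an ideal Gaussian distribution has a skewness of zero.

```python
import random
import statistics

def sample_means(sample_size, n_samples=10000, seed=2):
    """Steps 1 and 2: draw many samples from a skewed population, keep each mean.

    The population is exponential with mean 1 (an assumption for this sketch).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means

def skewness(values):
    """Return the skewness of values (zero for a symmetric distribution)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

# Step 3, summarized numerically instead of drawing a histogram:
# the skewness of the distribution of means for several sample sizes.
skews = {}
for size in (2, 10, 50):
    skews[size] = skewness(sample_means(size))
    print(f"sample size {size:2d}: skewness of the means = {skews[size]:.2f}")
```

The skewness of the means should shrink toward zero as the sample size grows, which is the Central Limit Theorem at work. For a population as skewed as this one, some asymmetry may still be visible at a sample size of ten, which illustrates why the required sample size depends on the shape of the population.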

To learn more about why the ideal Gaussian distribution is so useful, read about the Central Limit Theorem in any statistics text.