© 1999 GraphPad Software Inc.

The Prism Guide to Interpreting Statistical Results
This guide is excerpted from Analyzing Data with GraphPad Prism, a book that accompanies the program GraphPad Prism.

Choosing one-way ANOVA and related analyses

Introduction to comparisons of three or more groups

Prism can compare three or more groups with ordinary or repeated measures ANOVA, or with the nonparametric Kruskal-Wallis or Friedman tests. Following ANOVA, Prism can perform the Bonferroni, Tukey, Newman-Keuls, or Dunnett's post test. Following a nonparametric test, Prism can calculate Dunn's post test.

These tests compare measurements (continuous variables) such as weight, enzyme activity, and receptor number. Different tests are used to compare proportions or survival curves.

One-way ANOVA (and the related nonparametric tests) compares three or more groups when the data are categorized in one way. For example, you might compare a control group with two treated groups. If your data are categorized in two ways (for example, you want to compare control with two treated groups in both males and females), see two-way analysis of variance.

Repeated measures test?

You should choose a repeated measures test when the experiment uses matched subjects. Here are some examples:

  • You measure a variable in each subject before, during and after an intervention.
  • You recruit subjects as matched sets. Each subject in the set has the same age, diagnosis and other relevant variables. One member of the set gets treatment A, another gets treatment B, another gets treatment C, etc.
  • You run a laboratory experiment several times, each time with a control and several treated preparations handled in parallel.
  • You measure a variable in triplets, or grandparent/parent/child groups.

More generally, you should select a repeated measures test whenever you expect a value in one group to be closer to a particular value in the other groups than to a randomly selected value in another group.

Ideally, the decision about repeated measures analyses should be made before the data are collected. Certainly the matching should not be based on the variable you are comparing. If you are comparing blood pressures in two groups, it is ok to match based on age or zip code, but it is not ok to match based on blood pressure.

The term repeated measures applies strictly when you give treatments repeatedly to one subject. The other examples are called randomized block experiments (each set of subjects is called a block and you randomly assign treatments within each block). The analyses are identical for repeated measures and randomized block experiments, and Prism always uses the term repeated measures.

ANOVA or nonparametric test?

ANOVA, as well as other statistical tests, assumes that you have sampled data from populations that follow a Gaussian bell-shaped distribution. Biological data never follow a Gaussian distribution precisely, because a Gaussian distribution extends infinitely in both directions, so it includes both infinitely low negative numbers and infinitely high positive numbers! But many kinds of biological data follow a bell-shaped distribution that is approximately Gaussian. Because ANOVA works well even if the distribution is only approximately Gaussian (especially with large samples), these tests are used routinely in many fields of science.

An alternative approach does not assume that data follow a Gaussian distribution. In this approach, values are ranked from low to high and the analyses are based on the distribution of ranks. These tests, called nonparametric tests, are appealing because they make fewer assumptions about the distribution of the data. But there is a drawback. Nonparametric tests are less powerful than the parametric tests that assume Gaussian distributions. This means that P values tend to be higher, making it harder to detect real differences as being statistically significant. If the samples are large the difference in power is minor. With small samples, nonparametric tests have little power to detect differences. With very small groups, nonparametric tests have zero power - the P value will always be greater than 0.05.
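The claim about very small groups can be checked by direct enumeration. The sketch below is an illustration, not a description of how Prism computes P values (the text does not say). It assumes an exact permutation version of the Kruskal-Wallis test with no tied values: it lists every way the ranks could fall into the groups, computes the Kruskal-Wallis H statistic for each, and reports the smallest P value any data set could produce. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def rank_partitions(ranks, sizes):
    """Yield every way to split `ranks` into ordered groups of the given sizes."""
    if not sizes:
        yield ()
        return
    for group in combinations(ranks, sizes[0]):
        rest = [r for r in ranks if r not in group]
        for tail in rank_partitions(rest, sizes[1:]):
            yield (group,) + tail


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H for groups of ranks (assumes no ties)."""
    n_total = sum(len(g) for g in groups)
    weighted = sum(Fraction(sum(g) ** 2, len(g)) for g in groups)
    return Fraction(12, n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


def smallest_possible_p(sizes):
    """Smallest exact P value attainable with groups of these sizes."""
    n_total = sum(sizes)
    all_ranks = list(range(1, n_total + 1))
    h_values = [kruskal_wallis_h(p) for p in rank_partitions(all_ranks, list(sizes))]
    # The most extreme arrangements are those that reach the largest H.
    return Fraction(h_values.count(max(h_values)), len(h_values))


print(float(smallest_possible_p((2, 2, 2))))  # 6 of 90 arrangements, about 0.067
print(float(smallest_possible_p((3, 3, 3))))  # 6 of 1680 arrangements, about 0.0036
```

With three groups of two values each, even perfectly separated groups give P = 6/90, which is above 0.05. With three values per group, a P value below 0.05 becomes possible, but only for the most extreme arrangements.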

You may find it difficult to decide when to select nonparametric tests. You should definitely choose a nonparametric test in these situations:

  • The outcome variable is a rank or score with only a few categories. Clearly the population is far from Gaussian in these cases.
  • One, or a few, values are off scale, too high or too low to measure. Even if the population is Gaussian, it is impossible to analyze these data with a t test or ANOVA. Using a nonparametric test with these data is easy. Assign an arbitrary low value to values too low to measure, and an arbitrary high value to values too high to measure. Since the nonparametric tests only consider the relative ranks of the values, it won't matter that you didn't know one (or a few) of the values exactly.
  • You are sure that the population is far from Gaussian. Before choosing a nonparametric test, consider transforming the data (perhaps to logarithms or reciprocals). Sometimes a simple transformation will convert non-Gaussian data to a Gaussian distribution.
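The point about off-scale values can be illustrated with a small ranking routine. This is a minimal sketch with made-up numbers: the measurements and the two stand-in values (1000 and 1e9) are hypothetical, and `average_ranks` is an illustrative helper, not part of Prism. It shows that the ranks, which are all a nonparametric test uses, are the same whichever arbitrary high value replaces the unmeasurable one.

```python
def average_ranks(values):
    """Rank values from low to high (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Hypothetical measurements; a fifth value was too high to measure.
measured = [12.1, 15.3, 9.8, 20.4]
ranks_a = average_ranks(measured + [1000.0])  # one arbitrary stand-in
ranks_b = average_ranks(measured + [1e9])     # a very different stand-in

print(ranks_a)             # [2.0, 3.0, 1.0, 4.0, 5.0]
print(ranks_a == ranks_b)  # True
```

If several values are off scale in the same direction and receive the same stand-in, they are treated as tied with each other, which is all that can be said about them.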

In many situations, perhaps most, you will find it difficult to decide whether to select nonparametric tests. Remember that the Gaussian assumption is about the distribution of the overall population of values, not just the sample you have obtained in this particular experiment. Look at the scatter of data from previous experiments that measured the same variable. Also consider the source of the scatter. When variability is due to the sum of numerous independent sources, with no one source dominating, you expect a Gaussian distribution.
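The last point, that a sum of many independent sources of scatter tends toward a Gaussian shape, can be seen in a small simulation. The sketch below is only an illustration under assumed conditions: each source of variability is taken to be an exponential random variable (strongly skewed on its own), and skewness is used as a rough measure of departure from a symmetric bell shape. The names `simulate` and `skewness` are hypothetical.

```python
import random


def skewness(values):
    """Sample skewness: about 0 for a symmetric, bell-shaped distribution."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def simulate(n_sources, n_values=5000, seed=0):
    """Each simulated value is the sum of n_sources independent skewed contributions."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(n_sources))
        for _ in range(n_values)
    ]


skew_one_source = skewness(simulate(1))      # a single skewed source
skew_fifty_sources = skewness(simulate(50))  # fifty sources, none dominating

print(round(skew_one_source, 2), round(skew_fifty_sources, 2))
```

The values built from a single source stay strongly skewed, while the values built from fifty sources are much closer to symmetric.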

Prism performs normality testing in an attempt to determine whether data were sampled from a Gaussian distribution, but normality testing is less useful than you might hope. Normality testing doesn't help if you have fewer than a few dozen (or so) values.

Your decision to choose a parametric or nonparametric test matters the most when samples are small, for the reasons summarized here:

Parametric tests
  • Large samples (> 100 or so): Robust. P value will be nearly correct even if population is fairly far from Gaussian.
  • Small samples (< 12 or so): Not robust. If the population is not Gaussian, the P value may be misleading.

Nonparametric tests
  • Large samples (> 100 or so): Powerful. If the population is Gaussian, the P value will be nearly identical to the P value you would have obtained from a parametric test. With large sample sizes, nonparametric tests are almost as powerful as parametric tests.
  • Small samples (< 12 or so): Not powerful. If the population is Gaussian, the P value will be higher than the P value obtained from ANOVA. With very small samples, it may be impossible for the P value to ever be less than 0.05, no matter how the values differ.

Normality test
  • Large samples (> 100 or so): Useful. Use a normality test to determine whether the data are sampled from a Gaussian population.
  • Small samples (< 12 or so): Not very useful. Little power to discriminate between Gaussian and non-Gaussian populations. Small samples simply don't contain enough information to let you make inferences about the shape of the distribution in the entire population.

Which post test?

If you are comparing three or more groups, you may pick a post test to compare pairs of group means. Prism offers these choices:

  • Tukey. Compare all pairs of columns.
  • Newman-Keuls. Compare all pairs of columns.
  • Bonferroni. Compare all pairs of columns.
  • Bonferroni. Compare selected pairs of columns.
  • Dunnett. Compare all columns vs. control column.
  • Test for linear trend between mean and column number.

Choosing an appropriate post test is not straightforward, and different statistics texts make different recommendations.

Select Dunnett's test if one column represents control data, and you wish to compare all other columns to that control column but not to each other.

Select the test for linear trend if the columns are arranged in a natural order (e.g. dose or time) and you want to test whether there is a trend so that values increase (or decrease) as you move from left to right across columns.

Select the Bonferroni test for selected pairs of columns when you only wish to compare certain column pairs. You must select those pairs based on experimental design, and ideally should specify the pairs of interest before collecting any data. If you base your decision on the results (e.g. you compare the smallest with the largest mean), then you have effectively compared all columns, and it is not appropriate to use the test for selected pairs.

Select the Bonferroni, Tukey, or Newman-Keuls test (also known as the Student-Newman-Keuls or SNK test) if you want to compare all pairs of columns.

The only advantage of the Bonferroni method is that it is easy to understand. Its disadvantage is that it is too conservative, leading to P values that are too high and confidence intervals that are too wide. This is a minor concern when you compare only a few columns, but is a major problem when you have many columns. Don't use the Bonferroni test with more than five groups.
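The arithmetic behind this advice is simple to work through. The sketch below assumes the standard Bonferroni correction, in which the overall significance level (0.05 here) is divided by the number of comparisons. The text does not say how Prism reports the correction, so treat this as an illustration only; `bonferroni_threshold` is a hypothetical helper.

```python
from math import comb


def bonferroni_threshold(n_groups, family_alpha=0.05):
    """Return (number of pairwise comparisons, per-comparison significance threshold)."""
    n_comparisons = comb(n_groups, 2)  # every pair of columns
    return n_comparisons, family_alpha / n_comparisons


for n_groups in (3, 5, 10):
    n_comparisons, threshold = bonferroni_threshold(n_groups)
    print(f"{n_groups} groups: {n_comparisons} comparisons, "
          f"each tested at about {threshold:.4f}")
```

With 3 groups there are 3 comparisons, each tested at about 0.0167. With 10 groups there are 45 comparisons, each tested at about 0.0011, which shows how quickly the method becomes conservative as columns are added.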

Choosing between the Tukey and Newman-Keuls tests is not straightforward, and there appears to be no real consensus among statisticians. The two methods are related, and the rationale for the differences is subtle. The methods are identical when comparing the largest group mean with the smallest. For other comparisons, the Newman-Keuls test yields lower P values. The problem is that it is difficult to articulate exactly what null hypotheses the Newman-Keuls P values test. For that reason, and because the Newman-Keuls test does not generate confidence intervals, we suggest selecting Tukey's test. (If you select the Tukey test you are actually selecting the Tukey-Kramer test, which includes the extension by Kramer to allow for unequal sample sizes.)

Confirm test selection

Based on the option boxes you selected, Prism will choose a test for you:

Test                              Matched   Nonparametric
Ordinary one-way ANOVA            No        No
Repeated measures one-way ANOVA   Yes       No
Kruskal-Wallis test               No        Yes
Friedman test                     Yes       Yes
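
The selection rule in the table above can be written as a small lookup. This is only a sketch of the logic described here, not Prism's own code. The function name `choose_test` and its two boolean parameters are hypothetical stand-ins for the "matched" and "nonparametric" option boxes.

```python
def choose_test(matched: bool, nonparametric: bool) -> str:
    """Map the two option-box choices to the test named in the table."""
    table = {
        (False, False): "Ordinary one-way ANOVA",
        (True, False): "Repeated measures one-way ANOVA",
        (False, True): "Kruskal-Wallis test",
        (True, True): "Friedman test",
    }
    return table[(matched, nonparametric)]


# Example: matched subjects, analyzed with a rank-based (nonparametric) test.
print(choose_test(matched=True, nonparametric=True))  # Friedman test
```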