Key concepts: Multiple comparison tests

Terminology

The terminology is not always used consistently.

The term "multiple comparison test" applies whenever you make several comparisons at once. The term "post tests" is often used interchangeably.

If you decide which comparisons you want to make after looking at the data, those comparisons are called "post hoc tests".

If you focus on a few scientifically sensible comparisons chosen in advance, those are called "planned comparisons". These comparisons must be based on the scientific questions you are asking, and you must choose them when you design the experiment, not after looking at the data. Some statisticians argue that you don't need to correct for multiple comparisons when you do planned comparisons.

How multiple comparison tests work

All the multiple comparison tests (except planned comparisons) following one-way ANOVA differ from a regular t test in two ways:

First, the tests take into account the scatter of all the groups. This gives you a more precise value for scatter (Mean Square of Residuals), which is reflected in more degrees of freedom. When you compare mean A to mean C, the test compares the difference between means to the amount of scatter. With multiple comparison tests, scatter is quantified using information from all the groups, not just groups A and C. This gives the test more power to detect differences, but it only makes sense when you accept the assumption that all the data are sampled from populations with the same standard deviation, even if the means are different.

Second, the tests attempt to make the significance level apply to the entire family of comparisons, rather than to each comparison individually. With a 5% significance level, that means if all the groups really have the same mean, there is only a 5% chance that one or more of the comparisons would reach a "statistically significant" conclusion by chance. To do this, the multiple comparison methods use a stricter definition of significance for each individual comparison. This makes a Type I error (finding a "significant" result by chance) less likely, but at the cost of decreasing the power to detect real differences. If you are only making a few comparisons, the correction is smaller than it would be if you made lots of comparisons, so the loss of power is smaller.
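
The two points above can be illustrated with a short Python sketch. It is not how any particular program computes its post tests. The data values are made up, the function names are hypothetical, the per-comparison threshold uses the Bonferroni rule (chosen here only because it is simple), and the error-rate formula assumes the comparisons are independent, which pairwise comparisons after ANOVA are not exactly.

```python
from itertools import combinations
from statistics import mean


def pooled_scatter(groups):
    """Return (mean square of residuals, residual degrees of freedom).

    Scatter is measured around each group's own mean, then pooled
    over every group passed in.
    """
    ss_resid = 0.0
    df_resid = 0
    for values in groups.values():
        m = mean(values)
        ss_resid += sum((v - m) ** 2 for v in values)
        df_resid += len(values) - 1
    return ss_resid / df_resid, df_resid


def familywise_error_rate(alpha, n_comparisons):
    """Chance of one or more false positives, ASSUMING independent comparisons."""
    return 1 - (1 - alpha) ** n_comparisons


# Made-up example data: three groups of four values each.
groups = {
    "A": [4.0, 5.0, 6.0, 5.0],
    "B": [6.0, 7.0, 8.0, 7.0],
    "C": [9.0, 8.0, 10.0, 9.0],
}

# Point 1: scatter pooled from all groups versus from A and C only.
ms_all, df_all = pooled_scatter(groups)
ms_ac, df_ac = pooled_scatter({name: groups[name] for name in ("A", "C")})

# Point 2: family-wise error rate without and with a stricter threshold.
alpha = 0.05
n_comparisons = len(list(combinations(groups, 2)))  # all pairwise comparisons
uncorrected = familywise_error_rate(alpha, n_comparisons)
per_comparison_alpha = alpha / n_comparisons  # Bonferroni rule (an assumption here)
corrected = familywise_error_rate(per_comparison_alpha, n_comparisons)

print(f"df using all groups: {df_all}, df using A and C only: {df_ac}")
print(f"family-wise error rate, uncorrected: {uncorrected:.3f}")
print(f"family-wise error rate, corrected:   {corrected:.3f}")
```

Pooling all three groups yields more residual degrees of freedom than using groups A and C alone, and applying the stricter per-comparison threshold brings the family-wise error rate back to about 5%.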

Note that these two aspects of multiple comparison tests have opposite effects on the power of the test. The increased power due to more degrees of freedom can be greater or smaller than the decreased power due to a stricter significance threshold. In most cases, however, the loss of power due to the stricter significance threshold is much larger than the gain due to the increased degrees of freedom.
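
One rough way to see the two opposing effects is to compare the critical t value a difference must exceed under each scenario; a lower critical value means more power, all else being equal. The sketch below is only an illustration under stated assumptions: three hypothetical groups of five values each, a Bonferroni correction standing in for whichever method is actually used, and a simple numerical t distribution written with the Python standard library (the helper names are made up for this example).

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha, df):
    """Two-tailed critical t value, found by bisection on the CDF."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical design: 3 groups, 5 values per group, 3 pairwise comparisons.
n_groups, n_per_group, alpha = 3, 5, 0.05
n_comparisons = n_groups * (n_groups - 1) // 2

df_two_groups = 2 * (n_per_group - 1)    # regular t test on two groups
df_pooled = n_groups * (n_per_group - 1)  # scatter pooled from all groups

crit_two_groups = t_critical(alpha, df_two_groups)
crit_pooled = t_critical(alpha, df_pooled)
# Bonferroni-style stricter threshold (an assumption for this sketch).
crit_pooled_strict = t_critical(alpha / n_comparisons, df_pooled)

print(f"regular t test, df={df_two_groups}: {crit_two_groups:.3f}")
print(f"pooled scatter, df={df_pooled}: {crit_pooled:.3f}")
print(f"pooled scatter, stricter threshold: {crit_pooled_strict:.3f}")
```

In this hypothetical design the extra degrees of freedom lower the critical value only slightly, while the stricter threshold raises it by considerably more, consistent with the statement that the loss of power usually outweighs the gain.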

 



Copyright (c) 2007 GraphPad Software Inc. All rights reserved.
URL: http://www.graphpad.com/help/Prism5/Prism5Help.html?terminology_post_hoc_mutplie_c_2.htm