Frequently Asked Questions
Difference between "planned comparisons", "post-hoc tests", "multiple comparison tests", "post tests", and "orthogonal comparisons".
FAQ# 1091 Last Modified 1-January-2009
The ANOVA calculations test the null hypothesis that all groups of data really are sampled from distributions that have the same mean (so any observed differences are just due to coincidences of random sampling). Testing this hypothesis is rarely the reason you did the experiment. Instead, you want to look within the data, comparing this group with that group... So you want to make multiple comparisons. There are several ways you can do this:
I. All possible comparisons, including averages of groups. So you might compare the average of groups A and B with the average of groups C, D and E. Or you might compare group A with the average of groups B through F. Scheffé's test (not currently offered by any GraphPad program) does this.
II. All possible pairwise comparisons. Compare the mean of every group with the mean of every other group. Prism and InStat can do these comparisons with the Tukey or Newman-Keuls tests.
III. All against a control. If group A is the control, you may only want to compare A with B, A with C, A with D... but not compare B with C or C with D. Prism and InStat do this with Dunnett's test.
IV. Only a few comparisons based on your scientific goals. So you might want to compare A with B and B with C and that's it. Prism and InStat use Bonferroni's test for this.
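To make the difference between these approaches concrete, here is a minimal Python sketch that lists the comparisons each one implies for five groups. The group names and the helper functions `all_pairwise` and `all_vs_control` are illustrative assumptions, not part of Prism or InStat, and the sketch only enumerates comparisons; it does not run any statistical test.

```python
from itertools import combinations

def all_pairwise(groups):
    """Every group compared with every other group."""
    return list(combinations(groups, 2))

def all_vs_control(groups, control):
    """Each non-control group compared with the control only."""
    return [(control, g) for g in groups if g != control]

# Hypothetical group labels, chosen for illustration.
groups = ["A", "B", "C", "D", "E"]

pairwise = all_pairwise(groups)            # all pairs of means
vs_control = all_vs_control(groups, "A")   # treating A as the control
planned = [("A", "B"), ("B", "C")]         # a few comparisons chosen in advance

print(len(pairwise), len(vs_control), len(planned))  # 10 4 2
```

With k groups there are k(k-1)/2 pairwise comparisons but only k-1 comparisons against a control, which is one reason the choice of approach matters.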
The terminology is not always used consistently.
"Multiple comparison test" applies whenever you make several comparisons at once. This is the case in all four situations above.
"Post test" is generally used interchangeably with "multiple comparison test", so it applies to all the situations above.
"Post-hoc test" is used for situations where you can decide which comparisons you want to make after looking at the data. You don't need to plan ahead. Scenario I above (all possible comparisons) is clearly post-hoc testing. Scenarios II (all pairwise comparisons) and III (all against a control) require that you make some decisions based on your scientific goals. If you compare all pairs of means, you have decided not to do the more general comparisons of scenario I. If you choose to compare all groups against a control, you have decided on scientific grounds not to compare treatment groups with each other, but only to compare each to the control. Because these tests require some decisions that you should make before seeing any data, scenarios II and III are not strictly post-hoc tests, but the term sometimes gets used anyway.
Note: The term 'post hoc' means you can decide after the experiment. The usage in multiple comparison testing is not related to the Latin phrase "post hoc, ergo propter hoc," which refers to a false conclusion that one event caused another just because it came first.
"Planned comparison" tests require that you focus on a few scientifically sensible comparisons. You can't decide which comparisons to do after looking at the data. The choice must be based on the scientific questions you are asking, and must be made when you design the experiment. Hence the term planned comparisons. Scenario IV above (only a few comparisons) is clearly a case of planned comparisons. Scenarios II and III do require some planning, but are not usually lumped in with the term "planned comparison test".
Analyzing planned comparisons can be done in several ways. See: When I do planned comparisons after one-way ANOVA, do I need to correct for multiple comparisons?
When you make only a few comparisons, they are called "orthogonal" when each comparison involves different groups. Comparing Groups A and B is orthogonal to comparing Groups C and D, because there is no information in the data from groups A and B that is relevant when comparing Groups C and D. In contrast, comparing A and B is not orthogonal to comparing B and C, because both comparisons use the data from group B.
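One common way to check orthogonality is to write each comparison as a vector of contrast coefficients and test whether the sum of their products is zero. The sketch below assumes equal sample sizes in every group (with unequal sizes each product would need to be divided by the group's sample size), and the helper names `contrast` and `are_orthogonal` are hypothetical. For simple pairwise comparisons this agrees with the "no shared groups" description above.

```python
def contrast(groups, plus, minus):
    """Coefficient vector: +1 for one group, -1 for the other, 0 elsewhere."""
    return [1 if g == plus else -1 if g == minus else 0 for g in groups]

def are_orthogonal(c1, c2):
    """Two contrasts are orthogonal (equal n assumed) if their dot product is 0."""
    return sum(a * b for a, b in zip(c1, c2)) == 0

# Hypothetical group labels, chosen for illustration.
groups = ["A", "B", "C", "D"]
a_vs_b = contrast(groups, "A", "B")   # [1, -1, 0, 0]
c_vs_d = contrast(groups, "C", "D")   # [0, 0, 1, -1]
b_vs_c = contrast(groups, "B", "C")   # [0, 1, -1, 0]

print(are_orthogonal(a_vs_b, c_vs_d))  # True: no group is shared
print(are_orthogonal(a_vs_b, b_vs_c))  # False: both use group B
```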
When comparisons are orthogonal, each comparison can use an ordinary t test. You may still want to use the Bonferroni correction to adjust the significance level. The decision is the same as with planned comparisons, discussed above. When comparisons are not orthogonal, you gain power by accounting for the overlap. That is why the Tukey, Dunnett, and Newman-Keuls tests (among others) differ from a plain t test.
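As a rough illustration of the Bonferroni correction, the sketch below divides the overall significance level by the number of comparisons, or equivalently multiplies each P value by that number (capped at 1). The function names and the P values are made up for illustration; they do not come from any real data set or from any GraphPad program.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance level that keeps the familywise level at alpha."""
    return alpha / n_comparisons

def bonferroni_adjust(p_values):
    """Multiply each P value by the number of comparisons, capping at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical P values from three planned comparisons.
p_values = [0.01, 0.04, 0.5]

threshold = bonferroni_threshold(0.05, len(p_values))
adjusted = bonferroni_adjust(p_values)

for p, p_adj in zip(p_values, adjusted):
    # Comparing p with the threshold is equivalent to comparing p_adj with 0.05.
    print(f"P = {p:.3f}, adjusted P = {p_adj:.3f}, significant: {p < threshold}")
```

Either form leads to the same decisions: a comparison is declared significant when its unadjusted P value is below alpha divided by the number of comparisons.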