
# Relationship between multiple comparisons tests and t tests

## Fisher's LSD method

The only difference between a set of t tests and Fisher's LSD test is that the t tests compute the pooled SD from only the two groups being compared, while Fisher's LSD test computes the pooled SD from all the groups (which gains power). Note that unlike the Bonferroni, Tukey, Dunnett and Holm methods, Fisher's LSD does not correct for multiple comparisons.
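The difference can be sketched in a few lines of Python. This is a minimal illustration, not Prism's implementation: the `fishers_lsd` function and the data below are hypothetical, and the ordinary t test is computed with SciPy for contrast.

```python
import numpy as np
from scipy import stats

def fishers_lsd(groups, i, j):
    """P value comparing groups[i] vs groups[j], pooling the SD (and
    degrees of freedom) across *all* groups, as Fisher's LSD does.
    No multiple-comparison correction is applied."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Mean square error pooled over every group; df = N - k
    mse = sum((len(g) - 1) * np.var(g, ddof=1) for g in groups) / (n_total - k)
    gi, gj = groups[i], groups[j]
    se = np.sqrt(mse * (1 / len(gi) + 1 / len(gj)))
    t = (np.mean(gi) - np.mean(gj)) / se
    return 2 * stats.t.sf(abs(t), df=n_total - k)

# Hypothetical data: three groups of three values each
groups = [[10, 12, 11], [13, 15, 14], [20, 22, 21]]

p_t = stats.ttest_ind(groups[0], groups[1]).pvalue  # pools SD from 2 groups, df = 4
p_lsd = fishers_lsd(groups, 0, 1)                   # pools SD from all 3, df = 6
print(p_t, p_lsd)
```

With equal group variances, as here, the extra degrees of freedom make the LSD P value smaller than the plain t test P value, which is the power gain described above.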

## Tukey, Dunnett, Bonferroni

Multiple comparisons use a familywise definition of alpha. The significance level doesn't apply to each comparison, but rather to the entire family of comparisons. In general, this makes it harder to reach significance. This is really the main point of multiple comparisons, as it reduces the chance of being fooled by differences that are due entirely to random sampling. Here is an example:

| Group 1 | Group 2 | Group 3 |
|---------|---------|---------|
| 34      | 43      | 48      |
| 16      | 37      | 43      |
| 25      | 47      | 69      |

An unpaired t test comparing Groups 1 and 2 computes a P value of 0.0436, which is less than 0.05 and so is deemed statistically significant. But a Tukey multiple comparisons test after ANOVA computes a multiplicity adjusted P value of 0.1789, which is not statistically significant.
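This example can be checked with SciPy, a sketch rather than a reproduction of Prism's output. `scipy.stats.tukey_hsd` uses the same studentized range distribution, so the adjusted P value should agree closely with the one quoted above.

```python
from scipy import stats

group1 = [34, 16, 25]
group2 = [43, 37, 47]
group3 = [48, 43, 69]

# Ordinary unpaired t test on Groups 1 and 2 alone
p_t = stats.ttest_ind(group1, group2).pvalue

# Tukey's multiple comparisons test across all three groups
res = stats.tukey_hsd(group1, group2, group3)
p_tukey = res.pvalue[0, 1]  # adjusted P for Group 1 vs Group 2

print(f"t test P = {p_t:.4f}, Tukey adjusted P = {p_tukey:.4f}")
```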

In some cases, the effect of increasing the df (by pooling the SD of all groups, even when only comparing two) overcomes the effect of controlling for multiple comparisons. In these cases, you may find a 'significant' difference in a multiple comparisons test where you wouldn't find it doing a simple t test. Here is an example:

| Group 1 | Group 2 | Group 3 |
|---------|---------|---------|
| 34      | 43      | 48      |
| 38      | 45      | 49      |
| 29      | 47      | 47      |

Comparing Groups 1 and 2 by unpaired t test yields a two-tailed P value of 0.0164, while the Tukey multiple comparisons test calculates a multiplicity adjusted P value of 0.0073. If we set our threshold of 'significance' for this example to 0.01, the results are not 'statistically significant' with a t test but are statistically significant with the multiple comparisons test.
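The same SciPy sketch applied to this second data set shows the adjusted P value coming out *smaller* than the unadjusted one, because the tighter scatter in the third group shrinks the pooled SD and the extra df outweighs the multiplicity penalty:

```python
from scipy import stats

group1 = [34, 38, 29]
group2 = [43, 45, 47]
group3 = [48, 49, 47]

# Ordinary unpaired t test on Groups 1 and 2 alone (df = 4)
p_t = stats.ttest_ind(group1, group2).pvalue

# Tukey's test pools the SD across all three groups (df = 6)
res = stats.tukey_hsd(group1, group2, group3)
p_tukey = res.pvalue[0, 1]  # adjusted P for Group 1 vs Group 2

print(f"t test P = {p_t:.4f}, Tukey adjusted P = {p_tukey:.4f}")
```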

## FDR approach

When you ask Prism to use the FDR approach (using any of the three methods), it first computes the P values using the Fisher LSD method (if you assume sampling from Gaussian distributions) or the uncorrected Dunn's test (nonparametric). These methods do not correct for multiple comparisons.

Then the chosen method decides which of these P values are small enough to be called a "discovery", with the threshold depending on the distribution of P values, the number of P values, and which of the three methods you chose.

The first step, computing the P values, is very close to what a conventional t test does. The second step, deciding which are "discoveries", is very different.
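That second step can be sketched with the Benjamini-Hochberg step-up procedure, one common FDR method (Prism's three methods differ in details, but share this step-up logic). The P values below are hypothetical unadjusted values, e.g. from Fisher's LSD comparisons.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure: returns a boolean array marking
    which P values are 'discoveries' at FDR level q. Whether a given P value
    qualifies depends on the whole set of P values, not on it alone."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    # Find the largest rank i (1-based) with p_(i) <= (i/m) * q
    below = p[order] <= (np.arange(1, m + 1) / m) * q
    discoveries = np.zeros(m, dtype=bool)
    if below.any():
        cutoff = np.max(np.nonzero(below)[0])
        # Every P value ranked at or below the cutoff is a discovery
        discoveries[order[: cutoff + 1]] = True
    return discoveries

# Hypothetical unadjusted P values from a family of comparisons
pvals = [0.001, 0.008, 0.039, 0.041, 0.27, 0.60]
flags = benjamini_hochberg(pvals, q=0.05)
print(flags)
```

Note that 0.039 and 0.041 are below 0.05 yet are not discoveries here, which is exactly how the FDR approach differs from judging each comparison against a fixed per-comparison alpha.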