Adjusted P values as part of multiple comparisons.
We are often asked why multiple comparisons tests following one-way (or two-way) ANOVA don't report an individual P value for each comparison, rather than simply reporting which comparisons are statistically significant. It sounds like a simple question, but the answer is not so simple. Prism 6 offers two ways to report a P value for each comparison:
- Don't correct for multiple comparisons at all. Making individual comparisons after ANOVA, without taking into account the number of comparisons, is called the unprotected Fisher's LSD test.
- Report multiplicity adjusted P values. The rest of this page explains what this means.
What are adjusted P values?
Before defining adjusted P values, let's review the meaning of a P value from a single comparison. The P value is the answer to two equivalent questions:
- If the null hypothesis were true, what is the chance that random sampling would result in a difference this large or larger?
- What is the smallest significance threshold (alpha) at which this result would be statistically significant?
The latter form of the question is less familiar but equivalent to the first. It leads to a definition of the adjusted P value, which is the answer to this question:
What is the smallest significance level, when applied to an entire family of comparisons, at which a particular comparison will be deemed statistically significant?
The idea is pretty simple. There is nothing special about significance levels of 0.05 or 0.01. You can set the significance level to any probability you want. The adjusted P value is the smallest familywise significance level at which a particular comparison will be declared statistically significant as part of the multiple comparison testing.
Here is a simple way to think about it. You perform multiple comparisons twice. The first time, you set the familywise significance level to 5%. The second time, you set it to 1%. If a particular comparison is statistically significant in the first analysis (5% significance level) but not in the second (1% significance level), its adjusted P value must be between 0.01 and 0.05, say 0.0323.
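The two-pass idea can be sketched in Python. This is only an illustration, assuming the Bonferroni method (which compares each unadjusted P value with the familywise significance level divided by the number of comparisons), a hypothetical unadjusted P value of 0.008075, and a family of four comparisons. The function names are made up for this sketch; it is not how Prism performs the calculation.

```python
def bonferroni_significant(raw_p, n_comparisons, family_alpha):
    """Bonferroni decision rule: compare the unadjusted P value
    with the familywise significance level divided by the number
    of comparisons in the family."""
    return raw_p <= family_alpha / n_comparisons


def bonferroni_adjusted_p(raw_p, n_comparisons):
    """Smallest familywise significance level at which this comparison
    would be declared significant under Bonferroni, capped at 1."""
    return min(1.0, raw_p * n_comparisons)


raw_p = 0.008075   # hypothetical unadjusted P value for one comparison
n_comparisons = 4  # hypothetical size of the family

# Run the multiple comparisons "twice", at 5% and then at 1%.
significant_at_5_percent = bonferroni_significant(raw_p, n_comparisons, 0.05)
significant_at_1_percent = bonferroni_significant(raw_p, n_comparisons, 0.01)

# The adjusted P value lies between the two familywise levels.
adjusted_p = bonferroni_adjusted_p(raw_p, n_comparisons)

print(significant_at_5_percent, significant_at_1_percent, adjusted_p)
```

With these made-up numbers the comparison is significant at the 5% familywise level but not at 1%, and the adjusted P value is about 0.0323, which falls between 0.01 and 0.05 as the reasoning above requires.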
A separate adjusted P value is computed for each comparison in a family of comparisons. But the value of these adjusted P values depends on the entire family. The adjusted P value for one particular comparison would have a different value if there were a different number of comparisons or if the data in the other comparisons were changed.
Each comparison has its own adjusted P value. But these P values are computed from all the comparisons, and really can't be interpreted for just one comparison in isolation. If you added another group to the ANOVA, all of the adjusted P values would change.
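To see how an adjusted P value depends on the whole family, here is a minimal Python sketch using Holm's step-down method as the example adjustment. The unadjusted P values are invented, the function name `holm_adjusted` is hypothetical, and the sketch keeps the original unadjusted P values fixed when a comparison is added (in a real ANOVA, adding a group would usually change those too). It is not Prism's implementation.

```python
def holm_adjusted(p_values):
    """Holm step-down adjusted P values, returned in the input order.

    Sort the unadjusted P values from smallest to largest, multiply the
    k-th smallest (k = 1..m) by (m - k + 1), cap at 1, and enforce that
    the adjusted values never decrease along the sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):  # rank = 0 for the smallest P value
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical unadjusted P values for a family of three comparisons.
family_of_three = [0.010, 0.020, 0.040]
# The same three comparisons plus one more (also hypothetical).
family_of_four = family_of_three + [0.030]

adjusted_three = holm_adjusted(family_of_three)
adjusted_four = holm_adjusted(family_of_four)

for raw, a3, a4 in zip(family_of_three, adjusted_three, adjusted_four):
    print(f"raw={raw:.3f}  family of 3: {a3:.3f}  family of 4: {a4:.3f}")
```

In this sketch the first comparison's adjusted P value goes from about 0.03 to about 0.04 when the fourth comparison joins the family, even though its own data did not change. Note also that several comparisons can end up sharing the same adjusted P value.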
Adjusted P values in Prism
Prism can compute multiplicity adjusted P values following Bonferroni, Holm, Tukey or Dunnett multiple comparison testing. Check the option in the third tab of the ANOVA dialog. This feature was introduced in Prism 6.
Note a minor bug in Prism 6 and 7. With Dunnett's test, Prism can only compute adjusted P values that are greater than 0.0001. If the adjusted P value would be less than 0.0001, Prism reports 0.0001 but should report <0.0001.
Learn more about adjusted P values
Four places to learn about adjusted P values:
- Wright defines these adjusted P values and argues for their widespread use (S.P. Wright. Adjusted P-values for simultaneous inference. Biometrics 48:1005-1013,1992).
- In the book by Westfall (citation below).
- Adjusted P values are computed by SAS's MULTTEST procedure (PROC MULTTEST). However, the SAS documentation does not do a good job of explaining adjusted P values.
- Aickin explains how multiplicity adjusted P values are computed as part of Holm's multiple comparisons (Am. J. Public Health, 86:726, 1996).