© 1999 GraphPad Software Inc.

The Prism Guide to Interpreting Statistical Results
This guide is excerpted from Analyzing Data with GraphPad Prism, a book that accompanies the program GraphPad Prism.

Multiple comparisons

Interpreting an individual P value is easy. If the null hypothesis is true, the P value is the chance that random selection of subjects would result in a difference (or correlation or association...) as large as (or larger than) the one observed in your study. If the null hypothesis is true and you use the conventional threshold of 0.05, there is a 5% chance of randomly selecting subjects such that the trend is statistically significant.

However, many scientific studies generate more than one P value. Some studies in fact generate hundreds of P values. Interpreting multiple P values can be difficult.

If you test several independent null hypotheses, and leave the threshold at 0.05 for each comparison, there is greater than a 5% chance of obtaining at least one "statistically significant" result by chance. The second column in the table below shows you how much greater.

Number of independent   Probability of obtaining one or more   Threshold to keep overall risk
null hypotheses         P values less than 0.05 by chance      of type I error equal to 0.05
---------------------   ------------------------------------   ------------------------------
1                       5%                                     0.0500
2                       10%                                    0.0253
3                       14%                                    0.0170
4                       19%                                    0.0127
5                       23%                                    0.0102
6                       26%                                    0.0085
7                       30%                                    0.0073
8                       34%                                    0.0064
9                       37%                                    0.0057
10                      40%                                    0.0051
20                      64%                                    0.0026
50                      92%                                    0.0010
100                     99%                                    0.0005
N                       100(1.00 - 0.95^N)                     1.00 - 0.95^(1/N)

Note: "0.95^N" means 0.95 to the Nth power.
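
The two formulas in the last row of the table can be checked with a few lines of Python. This is a minimal sketch, not part of the original guide: the function names are invented for illustration, and the calculation assumes, as the table does, that the null hypotheses are independent and all true.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one P value below alpha among n_tests
    independent comparisons when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def per_comparison_threshold(n_tests: int, overall_alpha: float = 0.05) -> float:
    """Per-comparison cutoff that keeps the overall chance of any
    false positive at overall_alpha (independent comparisons assumed)."""
    return 1.0 - (1.0 - overall_alpha) ** (1.0 / n_tests)


# Print a few rows in the same column order as the table.
for n in (1, 2, 3, 10, 100):
    print(f"{n:>3}  {familywise_error_rate(n):6.1%}  {per_comparison_threshold(n):.4f}")
```

Changing the second argument lets you explore thresholds other than 0.05.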

To maintain the chance of randomly obtaining at least one statistically significant result at 5%, you need to set a stricter (lower) threshold for each individual comparison. This is tabulated in the third column of the table. If you only conclude that a difference is statistically significant when a P value is less than this value, then you'll have only a 5% chance of finding any "significant" difference by chance among all the comparisons.

For example, if you test three null hypotheses and use the traditional cutoff of alpha=0.05 for declaring each P value to be significant, there would be a 14% chance of observing one or more significant P values, even if all three null hypotheses were true. To keep the overall chance at 5%, you need to lower the threshold for significance to 0.0170.
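
The three-hypothesis example can also be checked by simulation. The sketch below is illustrative only. It assumes that all three null hypotheses are true and independent, so each P value can be drawn uniformly between 0 and 1. The function name, the number of simulated studies, and the seed are arbitrary choices made for this example.

```python
import random


def simulate_any_significant(n_tests, threshold, n_studies=100_000, seed=0):
    """Fraction of simulated studies with at least one P value below threshold.

    Assumes every null hypothesis is true and the tests are independent,
    so each P value is a uniform random number between 0 and 1.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # A study counts as a "hit" if any of its P values falls below the cutoff.
        if any(rng.random() < threshold for _ in range(n_tests)):
            hits += 1
    return hits / n_studies


uncorrected = simulate_any_significant(3, 0.05)
corrected = simulate_any_significant(3, 0.0170)
print(f"threshold 0.05:   {uncorrected:.3f}")
print(f"threshold 0.0170: {corrected:.3f}")
```

With enough simulated studies, the first fraction should land near 14% and the second near 5%, matching the table within sampling error.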

If you compare three or more groups, account for multiple comparisons using post tests following one-way ANOVA. These methods account both for multiple comparisons and for the fact that the comparisons are not independent. See How post tests work.

You can only account for multiple comparisons when you know about all the comparisons made by the investigators. If you report only "significant" differences, without reporting the total number of comparisons, others will not be able to properly evaluate your results. Ideally, you should plan all your analyses before collecting data, and then report all the results.

Distinguish studies that test a hypothesis from studies that generate one. Exploratory analyses of large databases can generate hundreds of P values, and scanning these can generate intriguing research hypotheses. To test these hypotheses, you'll need a different set of data.