The Prism Guide to Interpreting Statistical Results
This guide is excerpted from Analyzing Data with GraphPad Prism, a book that accompanies the program GraphPad Prism.

A Bayesian perspective on interpreting statistical significance

From Analyzing Data with GraphPad Prism. Copyright 1999 by GraphPad Software, Inc.  All Rights Reserved.

Imagine that you are screening drugs to see if they lower blood pressure. Based on the amount of scatter you expect to see, and the minimum change you would care about, you've chosen the sample size for each experiment to have 80% power to detect the difference you are looking for with a P value less than 0.05. What happens when you repeat the experiment many times?

The answer is "it depends". It depends on the context of your experiment. Let's look at the same experiment performed in three contexts. First, we'll assume that you know a bit about the pharmacology of the drugs, and expect 10% of the drugs to be active. In this case, the prior probability is 10%. Second, we'll assume you know a lot about the pharmacology of the drugs, and expect 80% to be active. Third, we'll assume that the drugs were selected at random, and you expect only 1% to be active in lowering blood pressure.

What happens when you perform 1000 experiments in each of these contexts? The details of the calculations are shown on pages 143-145 of Intuitive Biostatistics, by Harvey Motulsky (Oxford University Press, 1995). Since the power is 80%, you expect 80% of truly effective drugs to yield a P value less than 0.05 in your experiment. Since you set the threshold for statistical significance at 0.05, you expect 5% of ineffective drugs to yield a P value less than 0.05. Putting these calculations together creates these tables.

| A. Prior probability=10% | Drug really works | Drug really doesn't work | Total |
|---|---|---|---|
| P<0.05, "significant" | 80 | 45 | 125 |
| P>0.05, "not significant" | 20 | 855 | 875 |
| Total | 100 | 900 | 1000 |

| B. Prior probability=80% | Drug really works | Drug really doesn't work | Total |
|---|---|---|---|
| P<0.05, "significant" | 640 | 10 | 650 |
| P>0.05, "not significant" | 160 | 190 | 350 |
| Total | 800 | 200 | 1000 |

| C. Prior probability=1% | Drug really works | Drug really doesn't work | Total |
|---|---|---|---|
| P<0.05, "significant" | 8 | 50 | 58 |
| P>0.05, "not significant" | 2 | 940 | 942 |
| Total | 10 | 990 | 1000 |

The total for each column is determined by the prior probability - the context of your experiment. The prior probability equals the fraction of the experiments that are in the leftmost column. To compute the number of experiments in each row, use the definition of power and alpha. Of the drugs that really work, you won't obtain a P value less than 0.05 in every case. You chose a sample size to obtain a power of 80%, so 80% of the truly effective drugs yield "significant" P values, and 20% yield "not significant" P values.  Of the drugs that really don't work (middle column), you won't get "not significant" results in every case. Since you defined statistical significance to be "P<0.05" (alpha=0.05), you will see a "significant" result in 5% of experiments performed with drugs that are really inactive and a "not significant" result in the other 95%.
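The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the function names `round_half_up` and `outcome_table` are hypothetical, and the sketch assumes whole-number counts, with halves rounded up (so the 49.5 expected false positives in context C become 50, as in the table).

```python
def round_half_up(x):
    """Round a non-negative number to the nearest integer, halves going up."""
    return int(x + 0.5)


def outcome_table(prior, power=0.80, alpha=0.05, n_experiments=1000):
    """Expected counts for n_experiments screening experiments.

    prior: fraction of drugs that really work (the context)
    power: chance that a truly effective drug yields P < alpha
    alpha: chance that an ineffective drug yields P < alpha
    Each row is a pair: (drug really works, drug really doesn't work).
    """
    works = round_half_up(n_experiments * prior)
    does_not_work = n_experiments - works
    sig_works = round_half_up(works * power)
    sig_does_not_work = round_half_up(does_not_work * alpha)
    return {
        "significant": (sig_works, sig_does_not_work),
        "not significant": (works - sig_works, does_not_work - sig_does_not_work),
        "total": (works, does_not_work),
    }


# Rebuild the three tables, one per context, with a row total in the last column.
for label, prior in [("A", 0.10), ("B", 0.80), ("C", 0.01)]:
    print(f"{label}. Prior probability = {prior:.0%}")
    for row, (works, does_not_work) in outcome_table(prior).items():
        print(f"  {row:<16}{works:>5}{does_not_work:>5}{works + does_not_work:>6}")
```

Only the `prior` argument changes between the three contexts; power and alpha stay at 80% and 0.05 throughout.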

If the P value is less than 0.05, so the results are "statistically significant", what is the chance that the drug is really active? The answer is different for each experiment.

| Prior probability | Experiments with P<0.05 and drug really works | Experiments with P<0.05 and drug really doesn't work | Fraction of experiments with P<0.05 where drug really works |
|---|---|---|---|
| A. Prior probability=10% | 80 | 45 | 80/125 = 64% |
| B. Prior probability=80% | 640 | 10 | 640/650 = 98% |
| C. Prior probability=1% | 8 | 50 | 8/58 = 14% |

For experiment A, the chance that the drug is really active is 80/125, or 64%. If you observe a statistically significant result, there is a 64% chance that the difference is real and a 36% chance that the difference was caused by random sampling. For experiment B, there is a 98% chance that the difference is real. In contrast, if you observe a significant result in experiment C, there is only a 14% chance that the result is real and an 86% chance that it is a coincidence of random sampling. For experiment C, the vast majority of "significant" results are due to chance.
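These percentages can also be computed directly from the prior probability, power, and alpha, without building the tables first. The sketch below is one way to do it; the function name `chance_drug_really_works` is hypothetical, and the calculation works with expected fractions rather than rounded counts.

```python
def chance_drug_really_works(prior, power=0.80, alpha=0.05):
    """Fraction of "significant" (P < alpha) experiments in which the drug really works."""
    true_positives = prior * power          # drug works and P < alpha
    false_positives = (1 - prior) * alpha   # drug doesn't work, yet P < alpha
    return true_positives / (true_positives + false_positives)


# Same test, same power, same alpha; only the context (prior) differs.
for label, prior in [("A", 0.10), ("B", 0.80), ("C", 0.01)]:
    fraction = chance_drug_really_works(prior)
    print(f"{label}. prior {prior:.0%}: {fraction:.1%} of significant results are real")
```

For the three contexts this works out to roughly 64%, 98%, and 14%. The value for context C differs slightly from 8/58 in the third decimal place because the table rounds 49.5 expected false positives up to 50.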

Your interpretation of a "statistically significant" result depends on the context of the experiment. You can't interpret a P value in a vacuum. Interpreting results requires common sense, intuition, and judgment.