The false discovery rate and statistical significance

Interpreting low P values is not straightforward

Imagine that you are screening drugs to see if they lower blood pressure. You define statistical significance with the usual threshold of P<0.05. Based on the amount of scatter you expect to see and the minimum change you would care about, you've chosen the sample size for each experiment to have 80% power to detect the difference you are looking for with a P value less than 0.05.

If you do get a P value less than 0.05, what is the chance that the drug truly works?

The answer is: It depends on the context of your experiment. Let's start with the scenario where, based on the context of the work, you estimate there is a 10% chance that the drug actually has an effect. What happens when you perform 1000 experiments? Given your 10% estimate, the two column totals below are 100 (drug really works) and 900 (drug really doesn't work). Since the power is 80%, you expect 80% of the 100 truly effective drugs to yield a P value less than 0.05 in your experiment, so the first row of the "Drug really works" column is 80. Since you set the threshold for statistical significance to 0.05, you expect 5% of the 900 ineffective drugs to yield a P value less than 0.05, so the first row of the "Drug really doesn't work" column is 45.

| | Drug really works | Drug really doesn't work | Total |
|---|---|---|---|
| P<0.05, “significant” | 80 | 45 | 125 |
| P>0.05, “not significant” | 20 | 855 | 875 |
| Total | 100 | 900 | 1000 |

In all, you expect to see 125 experiments that yield a "statistically significant" result, and the drug really works in only 80 of these. The other 45 experiments yield a "statistically significant" result but are false positives, or false discoveries. The false discovery rate (abbreviated FDR) is 45/125, or 36%. Not 5%, but 36%. This is also called the False Positive Risk (FPR).
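The arithmetic behind the table can be sketched in a few lines of Python. This is only an illustration: the function name `expected_counts` and its dictionary keys are made up for this example, and the defaults are the 80% power and 0.05 threshold used above.

```python
def expected_counts(n_experiments, prior, power=0.80, alpha=0.05):
    """Expected 2x2 table of outcomes (hypothetical helper, for illustration).

    prior : fraction of experiments in which the drug really works
    power : chance that a truly effective drug yields P < alpha
    alpha : chance that an ineffective drug yields P < alpha
    """
    works = n_experiments * prior
    no_effect = n_experiments - works

    true_positives = works * power        # drug works, P < alpha
    false_negatives = works - true_positives
    false_positives = no_effect * alpha   # drug doesn't work, P < alpha
    true_negatives = no_effect - false_positives

    significant = true_positives + false_positives
    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "true_negatives": true_negatives,
        "significant": significant,
        # Fraction of "significant" results that are false discoveries
        "fdr": false_positives / significant,
    }


table = expected_counts(1000, prior=0.10)
for name, value in table.items():
    if name == "fdr":
        print(f"{name}: {value:.0%}")
    else:
        print(f"{name}: {value:.0f}")
```

With 1000 experiments and a 10% prior, this gives the 80, 45, 20 and 855 cells shown in the table and an FDR of 36%.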

The table below, from chapter 12 of Essential Biostatistics, shows the FDR for this and three other scenarios.

| Scenario | Prior Probability | FDR for P<0.05 | FDR for 0.045 < P < 0.050 |
|---|---|---|---|
| Comparing randomly assigned groups in a clinical trial prior to treatment | 0% | 100% | 100% |
| Testing a drug that might possibly work | 10% | 36% | 78% |
| Testing a drug with 50:50 chance of working | 50% | 6% | 27% |
| Positive controls | 100% | 0% | 0% |

Each row in the table above is for a different scenario, defined by a different prior (before collecting data) probability of there being a real effect. The "FDR for P<0.05" column shows the expected FDR (also called FPR), calculated as above with 80% power. This column answers the question: "If the P value is less than 0.05, what is the chance that there really is no effect and the result is just a matter of random sampling?" Note that this answer is not 5%. The FDR is quite different from alpha, the threshold P value used to define statistical significance.
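The same calculation can be written directly in terms of the prior probability, which makes it easy to see how the FDR for P<0.05 changes from row to row. The sketch below assumes the 80% power and alpha of 0.05 used throughout this page; the function name `false_discovery_rate` is hypothetical.

```python
def false_discovery_rate(prior, power=0.80, alpha=0.05):
    """Expected fraction of P < alpha results that come from no real effect.

    Hypothetical helper for illustration; assumes every experiment has the
    same power and the same significance threshold.
    """
    false_positives = (1 - prior) * alpha  # no real effect, yet P < alpha
    true_positives = prior * power         # real effect, and P < alpha
    return false_positives / (false_positives + true_positives)


# The four prior probabilities from the table above
for prior in (0.0, 0.10, 0.50, 1.0):
    print(f"prior {prior:.0%}: FDR {false_discovery_rate(prior):.0%}")
```

For priors of 0%, 10%, 50% and 100%, this prints FDR values of 100%, 36%, 6% and 0%, matching the table.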

The right column, determined by simulations, asks a slightly different question, based on work by Colquhoun (1,2): "If the P value is just a little bit less than 0.05 (between 0.045 and 0.050), what is the chance that there really is no effect and the result is just a matter of random sampling?" For the two intermediate scenarios, these numbers are much higher than those for P<0.05. Focus on the third row, where the prior probability is 50%. In this case, if the P value is just barely under 0.05, there is a 27% chance that there really is no effect and the result is just a matter of random sampling. Note: 27%, not 5%! And in a more exploratory situation where you think the prior probability is 10%, the false discovery rate for P values just barely lower than 0.05 is 78%. In this situation, a statistically significant result (defined conventionally) means almost nothing.
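The values in the right column came from simulations. As a rough cross-check, the sketch below replaces the simulation with a closed-form calculation for a two-sided z test whose effect size is chosen to give 80% power at alpha = 0.05. That substitution, and the function name `fdr_in_p_window`, are assumptions made for this example, so the results should land close to the table's values without matching them exactly.

```python
from statistics import NormalDist


def fdr_in_p_window(prior, p_low=0.045, p_high=0.050, power=0.80, alpha=0.05):
    """Approximate FDR for results whose P value falls between p_low and p_high.

    Assumption: a two-sided z test, not the simulated tests behind the table.
    """
    std = NormalDist()

    # Mean of the z statistic under a real effect, chosen so that the test
    # has the requested power (ignoring the tiny wrong-direction tail).
    shift = std.inv_cdf(1 - alpha / 2) + std.inv_cdf(power)

    # |z| values that correspond to the edges of the P value window
    z_low = std.inv_cdf(1 - p_high / 2)   # |z| giving P = p_high
    z_high = std.inv_cdf(1 - p_low / 2)   # |z| giving P = p_low

    # With no real effect, P is uniform, so the window's probability is its width.
    window_if_null = p_high - p_low

    # With a real effect, z is normal with mean `shift`; add both tails.
    alt = NormalDist(mu=shift)
    window_if_real = (alt.cdf(z_high) - alt.cdf(z_low)) + (
        alt.cdf(-z_low) - alt.cdf(-z_high)
    )

    false_hits = (1 - prior) * window_if_null
    true_hits = prior * window_if_real
    return false_hits / (false_hits + true_hits)


for prior in (0.10, 0.50):
    print(f"prior {prior:.0%}: FDR {fdr_in_p_window(prior):.0%}")
```

Under these assumptions, the result is within a couple of percentage points of the table's 78% and 27%. The reason the numbers are so high is that a P value just under 0.05 is only a little more likely when the drug works than when it doesn't.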

Bottom line: You can't interpret statistical significance (or a P value) in a vacuum. Your interpretation depends on the context of the experiment. The false discovery rate can be much higher than the value of alpha (usually 5%). Interpreting results requires common sense, intuition, and judgment.

References

1. Colquhoun, D. (2014). An investigation of the false discovery rate and the misinterpretation of p-values. Royal Society Open Science, 1(3), 140216. http://doi.org/10.1098/rsos.140216

2. Colquhoun, D. (2019). The False Positive Risk: A Proposal Concerning What to Do About p-Values. The American Statistician, 73(sup1).