How to report statistical results
GraphPad Statistics Guide

The guidelines below are an opinionated guide about how to present data and analyses. Of course, you also need to report details of experimental design, including blinding and randomization.

Overall

Every statistical paper should report all methods (including those used to process and analyze the data) completely enough so someone else could reproduce the work exactly.

Every figure and table should present the data clearly, without exaggerating it to emphasize your conclusion.

All the results should be reported completely enough that no one wonders what you actually did.

The analyses before the analyses

Did you decide to normalize? Remove outliers? Transform to logarithms? Smooth? Remove a baseline? Justify these decisions, and report enough details so anyone could start with your data and get exactly the same results. State whether these calculations were preplanned or only decided upon after seeing the data.

If outliers were eliminated, say how many there were, what criteria you used to identify them, and whether these criteria were chosen in advance as part of the experimental design.
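
As an illustration of the details worth reporting, here is a minimal Python sketch that applies one explicit outlier rule and prints how many values it removed. The rule (Tukey's fences with k = 1.5), the data values, and the PREPLANNED flag are all assumptions made up for this example. They are not a recommendation of any particular criterion.

```python
import statistics

# Hypothetical flag: record whether the rule was fixed before seeing the data.
PREPLANNED = True

def tukey_outliers(values, k=1.5):
    """Split values into (kept, removed) using Tukey's fences."""
    # Quartiles from Python's default 'exclusive' method; report this choice,
    # because other quartile definitions can give different fences.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if low_fence <= v <= high_fence]
    removed = [v for v in values if v < low_fence or v > high_fence]
    return kept, removed

# Invented measurements, for illustration only.
data = [9.8, 10.1, 10.4, 9.9, 10.2, 10.0, 25.0]
kept, removed = tukey_outliers(data)

print(f"Removed {len(removed)} of {len(data)} values: {removed}")
print("Criterion: Tukey's fences, k = 1.5, 'exclusive' quartiles")
print(f"Criterion chosen in advance: {PREPLANNED}")
```

The point is not the rule itself but that the count, the criterion and its options, and whether it was preplanned are all stated explicitly.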

Sample size

Report how you chose sample size.

Explain exactly what was counted when reporting sample size. When you say n=3, do you mean three different animals, three different assays on tissue from one animal, one assay from tissue pooled from three animals, three repeat counts in a gamma counter from a preparation made from one run of an experiment...?

State whether you chose sample size in advance, or adjusted sample size in an ad hoc manner as you saw the results accumulate.

If the sample sizes of the groups are not equal, explain why.

Avoid P-hacking

For each analysis (usually for each figure and table), state whether every step in data analysis followed a preplanned protocol or not. If you only decided to remove outliers after seeing the data, say so. If you only decided to use a nonparametric test after seeing the data, say so. If you only decided to analyze the logarithms of the data after viewing the data, say so.

If you don't show every analysis you did, at least describe and enumerate them.

If you started with one sample size and ended with another sample size, explain exactly how you decided to add or eliminate samples. State whether these decisions were based on a preset protocol, or were decided during the course of the experiment.

Graphing data

Present data clearly. Focus on letting the reader see the data, and not only your conclusions.

When possible, graph the individual data, not a summary of the data. If there are too many values to show in scatter plots, consider box-and-whisker plots or frequency distributions.

If you choose to plot means with error bars, graph standard deviation error bars which show variability, rather than standard error of the mean error bars, which do not.
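
To see why the two kinds of error bars differ, recall that the SEM equals the SD divided by the square root of n. The short Python sketch below uses invented values (an assumption for illustration only) and repeats the same five values four times to mimic a larger sample with the same spread.

```python
import math
import statistics

def sd_and_sem(values):
    """Return (sample SD, SEM) for a list of numbers."""
    sd = statistics.stdev(values)        # sample SD (n - 1 denominator)
    sem = sd / math.sqrt(len(values))    # SEM = SD / sqrt(n)
    return sd, sem

# Invented measurements, for illustration only.
small = [4.0, 6.0, 5.0, 7.0, 3.0]
large = small * 4   # same scatter, four times as many values

sd_small, sem_small = sd_and_sem(small)
sd_large, sem_large = sd_and_sem(large)

print(f"n={len(small)}: SD={sd_small:.2f}, SEM={sem_small:.2f}")
print(f"n={len(large)}: SD={sd_large:.2f}, SEM={sem_large:.2f}")
```

The SD stays about the same as n grows because it describes the scatter of the values. The SEM shrinks because it describes how precisely the mean is known, so SEM error bars say little about variability.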

Statistical methods

State the full name of the test. Don't just say "t test"; say, for example, "paired t test".

Identify the program that did the calculations (including detailed version number, which for GraphPad Prism might be 7.01).

State all options you selected. Repeated measures? Correcting for unequal variances? Robust regression? Constraining parameters? Sharing parameters? Report enough detail so anyone could start with your data and get precisely the same results you got.

Reporting effect size

The most important result of most experiments is an effect size. How big was the difference (or ratio or percent increase)? Or how strongly were two variables correlated? In almost all cases, you can summarize this effect size with a single value and should report this effect with a confidence interval, usually the 95% interval. This is by far the most important finding to report in a paper and its abstract.

Consider showing a graph of effect sizes (i.e. differences or ratios) with 95% confidence intervals.
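
As a rough sketch of reporting an effect size with its interval, the Python below computes a difference between two means and a 95% confidence interval. It uses a percentile bootstrap only because that needs nothing beyond the standard library. This is a stand-in technique, not the t-based interval that statistics programs usually report. The data, the function name bootstrap_ci_mean_diff, the number of resamples, and the seed are all assumptions for illustration.

```python
import random
import statistics

def bootstrap_ci_mean_diff(a, b, n_boot=10000, level=0.95, seed=0):
    """Percentile bootstrap CI for mean(b) - mean(a)."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(statistics.fmean(resample_b) - statistics.fmean(resample_a))
    diffs.sort()
    tail = (1.0 - level) / 2.0
    lower = diffs[round(tail * n_boot)]
    upper = diffs[round((1.0 - tail) * n_boot) - 1]
    return lower, upper

# Invented measurements, for illustration only.
control = [10.1, 9.4, 11.2, 10.8, 9.9, 10.5]
treated = [12.3, 11.8, 13.1, 12.6, 11.5, 12.9]

effect = statistics.fmean(treated) - statistics.fmean(control)
low, high = bootstrap_ci_mean_diff(control, treated)

print(f"Difference between means: {effect:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Whatever method you use, name it in the methods (here: percentile bootstrap, number of resamples, seed) so the interval can be reproduced.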

Reporting P values

When possible, report the P value as a number with a few digits of precision, not an inequality. For example, say "the P value was 0.0234" rather than "P < 0.05".

If there is any possible ambiguity, clearly state the null hypothesis the P value tests. If you don't know the null hypothesis, then you shouldn't report a P value (since every P value tests a null hypothesis)!

When comparing two groups, state if the P value is one- or two-sided (which is the same as one- or two-tailed). If one-sided, state that you predicted the direction of the effect before collecting data and recorded this prediction. If you didn't make this decision and prediction before collecting data, you should not report a one-sided P value.
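
The difference between one- and two-sided P values is easy to see with a small exact permutation test, sketched below in Python. The test, the data, and the function name permutation_p_values are assumptions chosen for illustration, and the null hypothesis is that the group labels are exchangeable. The sketch prints each P value as a number rather than an inequality.

```python
from itertools import combinations
from statistics import fmean

def permutation_p_values(a, b):
    """Exact permutation test on mean(b) - mean(a).

    Returns (one-sided P for b > a, two-sided P).
    """
    pooled = a + b
    total = sum(pooled)
    observed = fmean(b) - fmean(a)
    n_a, n_b = len(a), len(b)
    eps = 1e-12  # tolerance for floating-point ties
    count_greater = count_extreme = n_perms = 0
    # Enumerate every way of relabeling n_b of the pooled values as group b.
    for idx in combinations(range(len(pooled)), n_b):
        sum_b = sum(pooled[i] for i in idx)
        diff = sum_b / n_b - (total - sum_b) / n_a
        n_perms += 1
        if diff >= observed - eps:
            count_greater += 1
        if abs(diff) >= abs(observed) - eps:
            count_extreme += 1
    return count_greater / n_perms, count_extreme / n_perms

# Invented measurements, for illustration only.
control = [10.1, 9.4, 11.2, 10.8, 9.9]
treated = [12.3, 11.8, 13.1, 12.6, 11.5]

p_one, p_two = permutation_p_values(control, treated)
print(f"One-sided P (treated > control, predicted in advance) = {p_one:.4f}")
print(f"Two-sided P = {p_two:.4f}")
```

With equal group sizes the two-sided value here is twice the one-sided value, which is why the choice must be made, and recorded, before seeing the data.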

Reporting statistical hypothesis testing (significance)

Statistical hypothesis testing is used to make a firm decision based on a single P value. One use is choosing between the fit of two alternative models. If the P value is less than a preset threshold you pick one model, otherwise the other. When doing this, state both models, the method you are using to choose between them, the preset threshold P value, and the model you chose. Perhaps also report the goodness of fit of both models.

When comparing groups, you don't always make a decision based on the result. If you are making a crisp decision, report the threshold P value, whether the computed P value was greater or less than the threshold, and the accompanying decision. If you are not making a decision, report the effect with its confidence interval, and perhaps a P value. If you are not making a decision based on that P value, then it doesn't really matter whether the P value was less than a threshold, and the whole idea of statistical hypothesis testing isn't really useful.

The word "significant" has two related meanings, which has caused lots of confusion in science. The two paragraphs above show that the results of statistical hypothesis testing can (and in my opinion should) be reported without using the word "significant". If you do choose to use the word "significant" in this context, always precede it with "statistically", so there is no confusion.

Never use the word "significant" when discussing the clinical or physiological impact of a result. Instead use words like "large", "substantial", and "clinically relevant". Using "significant" in this context just leads to confusion.

Multiple comparisons

Multiple comparisons must be handled thoughtfully, and all steps must be documented. Note that the problem of multiple comparisons is widespread, and isn't just an issue when doing follow-up tests after ANOVA.

State whether all comparisons were planned, and whether all planned comparisons were reported. If you report unplanned comparisons or omit some comparisons, the results must be identified as preliminary.

If you used any correction for multiple comparisons, explain the details.

If you report multiplicity adjusted P values, point out clearly that these P values were adjusted.
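
To make clear what "adjusted" means, here is a minimal Python sketch of one common correction, the Holm step-down method. Holm is only an example chosen for this sketch. The raw P values and the function name holm_adjust are invented for illustration.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted P values, in the original order."""
    m = len(p_values)
    # Indices of the P values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply the k-th smallest P value by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Adjusted values may never decrease as the raw values increase.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

# Invented raw P values from four comparisons, for illustration only.
raw = [0.010, 0.040, 0.030, 0.005]
adj = holm_adjust(raw)

for p, q in zip(raw, adj):
    print(f"raw P = {p:.3f}, Holm-adjusted P = {q:.3f}")
```

Because an adjusted P value depends on how many comparisons were in the family, report the method, the number of comparisons, and label the values as adjusted.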

Other guides to presenting statistical results

1. Curtis, M. J., Bond, R. A., Spina, D., Ahluwalia, A., Alexander, S. P. A., Giembycz, M. A., et al. (2015). Experimental design and analysis and their reporting: new guidance for publication in BJP. Br J Pharmacol, 172(14), 3461–3471.

2. Altman, D. G., Gore, S. M., Gardner, M. J., & Pocock, S. J. (1983). Statistical guidelines for contributors to medical journals. Br Med J, 286, 1489–1493.