Interpreting the Wilcoxon signed rank test
How the Wilcoxon signed rank test works
A Wilcoxon signed rank test compares the median of a single column of numbers against a hypothetical median that you enter.
Prism follows these steps:
Calculate how far each value is from the hypothetical median.
Ignore values that exactly equal the hypothetical value. Call the number of remaining values N.
Rank these distances, paying no attention to whether the values are higher or lower than the hypothetical value.
For each value that is lower than the hypothetical value, multiply the rank by negative 1.
Sum the positive ranks. Prism reports this value.
Sum the negative ranks. Prism also reports this value.
Add the two sums together. This is the sum of signed ranks, which Prism reports as W.
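The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not Prism's code: the function name `signed_rank_sums` is hypothetical, and giving tied distances the average of their ranks is an assumption (the steps above do not say how ties are handled).

```python
def signed_rank_sums(values, hypothetical_median):
    """Return (N, sum of positive ranks, sum of negative ranks, W)."""
    # Distance of each value from the hypothetical median,
    # ignoring values that exactly equal it.
    diffs = [v - hypothetical_median for v in values if v != hypothetical_median]
    n = len(diffs)

    # Rank the absolute distances from smallest (rank 1) to largest (rank N).
    # Assumption: tied distances share the average of the ranks they span.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Ranks of values below the hypothetical median count as negative.
    positive_sum = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative_sum = -sum(r for r, d in zip(ranks, diffs) if d < 0)
    return n, positive_sum, negative_sum, positive_sum + negative_sum


n, pos, neg, w = signed_rank_sums([12, 15, 9, 20, 10, 17], 10)
print(n, pos, neg, w)
```

In this made-up example the value 10 equals the hypothetical median and is dropped, so N is 5. The remaining distances are 2, 5, -1, 10 and 7, which get ranks 2, 3, 1, 5 and 4. Only the distance of -1 is negative, so the positive ranks sum to 14, the negative ranks sum to -1, and W is 13.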
If the data really were sampled from a population with the hypothetical median, you'd expect W to be near zero. If W (the sum of signed ranks) is far from zero, the P value will be small. The P value answers this question: Assuming that you randomly sample N values from a population with the hypothetical median, what is the chance that W will be as far from zero as you observed, or further?
Don't confuse the Wilcoxon signed rank test (compare one group with a hypothetical median) with the Wilcoxon matched pairs test (compare medians of two paired groups). See The results of a Wilcoxon matched pairs test.
With small samples, Prism computes an exact P value. With larger samples, Prism uses an approximation that is quite accurate.
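To make the difference between an exact and an approximate P value concrete, here is a hedged sketch of one standard way to compute each. It is not taken from Prism and may differ from what Prism does internally. The function names are hypothetical, the exact version assumes no tied ranks, and the approximation assumes a plain normal distribution for W with mean 0 and variance N(N+1)(2N+1)/6, with no continuity or tie correction.

```python
from itertools import product
import math


def exact_two_tailed_p(w_observed, n):
    """Exact two-tailed P value by enumerating every sign pattern.

    Under the null hypothesis each of the ranks 1..n is equally likely
    to be positive or negative, so all 2**n sign patterns are equally likely.
    Assumes no tied ranks. Only practical for small n.
    """
    ranks = range(1, n + 1)
    total = 0
    extreme = 0
    for signs in product((1, -1), repeat=n):
        w = sum(s * r for s, r in zip(signs, ranks))
        total += 1
        if abs(w) >= abs(w_observed):
            extreme += 1
    return extreme / total


def normal_approx_two_tailed_p(w_observed, n):
    """Two-tailed P value from a normal approximation to W (an assumption)."""
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 6)
    z = abs(w_observed) / sd
    # For a standard normal, 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))


print(exact_two_tailed_p(13, 5))
print(normal_approx_two_tailed_p(13, 5))
```

With N = 5 and W = 13, only four of the 32 sign patterns give a W at least that far from zero (W = 15, 13, -13 and -15), so the exact two-tailed P value is 4/32 = 0.125. The enumeration doubles in cost with each added value, which is one reason an approximation is attractive for larger samples.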
How to think about the results of a Wilcoxon signed rank test
The Wilcoxon signed rank test is a nonparametric test that compares the median of one column of numbers to a theoretical median.
Look first at the P value, which answers this question: If the data were sampled from a population with a median equal to the hypothetical value you entered, what is the chance of randomly selecting N data points and finding a median as far (or further) from the hypothetical value as observed here?
If the P value is small, you can reject the idea that the difference is a coincidence, and conclude instead that the population has a median distinct from the hypothetical value you entered.
If the P value is large, the data do not give you any reason to conclude that the overall median differs from the hypothetical median. This is not the same as saying that the medians are the same. You just have no compelling evidence that they differ. If you have small samples, the Wilcoxon test has little power. In fact, if you have five or fewer values, the Wilcoxon test will always give a two-tailed P value greater than 0.05, no matter how far the sample median is from the hypothetical median.
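The five-value limit follows from simple counting. With N values that differ from the hypothetical median, there are 2^N equally likely patterns of plus and minus signs under the null hypothesis. The most extreme result possible puts every value on the same side of the hypothetical median, and only two patterns do that (all above or all below). The sketch below tabulates the resulting floor on the two-tailed P value. The function name is hypothetical, and the calculation assumes an exact test with no tied ranks.

```python
def smallest_two_tailed_p(n):
    """Smallest exact two-tailed P value attainable with n nonzero differences."""
    # Two sign patterns (all positive, all negative) out of 2**n are the most extreme.
    return 2 / 2 ** n


for n in range(3, 8):
    print(n, smallest_two_tailed_p(n))
```

For N = 5 the floor is 2/32 = 0.0625, which is above 0.05. For N = 6 it drops to 2/64 = 0.03125, so six values is the smallest sample that can reach P < 0.05 with a two-tailed test.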
Checklist. Is the Wilcoxon test right for these data?
Before interpreting the results of any statistical test, first think carefully about whether you have chosen an appropriate test. Before accepting results from a Wilcoxon test, ask yourself these questions (Prism cannot help you answer them):
| Question | Discussion |
| --- | --- |
| Are the "errors" independent? | The term "error" refers to the difference between each value and the group median. The results of a Wilcoxon test only make sense when the scatter is random, meaning that any factor that causes a value to be too high or too low affects only that one value. Prism cannot test this assumption. See The need for independent samples. |
| Are the data clearly sampled from a non-Gaussian population? | By selecting a nonparametric test, you have avoided assuming that the data were sampled from a Gaussian distribution. But there are drawbacks to using a nonparametric test. If the population really is Gaussian, the nonparametric test has less power (is less likely to give you a small P value), especially with small sample sizes. Furthermore, Prism (along with most other programs) does not calculate confidence intervals when calculating nonparametric tests. If the distribution is clearly not bell-shaped, consider transforming the values (perhaps with logs or reciprocals) to create a Gaussian distribution and then using a one-sample t test. |
| Are the data distributed symmetrically? | The Wilcoxon test does not assume that the data are sampled from a Gaussian distribution. However, it does assume that the data are distributed symmetrically around their median. |
| If you chose a one-tail P value, did you predict correctly? | If you chose a one-tail P value, you should have predicted, before collecting data, whether the median would be larger or smaller than the hypothetical value. Prism does not ask you to record this prediction, but assumes it is correct. If your prediction was wrong, ignore the P value reported by Prism and state that P>0.50. See One- vs. two-tail P values. |