Interpreting results: Correlation


Correlation coefficient

The correlation coefficient, r, ranges from -1 to +1. The nonparametric Spearman correlation coefficient, abbreviated rs, has the same range.

Value of r (or rs)    Interpretation
1.0                   Perfect correlation.
0 to 1                The two variables tend to increase or decrease together.
0.0                   The two variables do not vary together at all.
0 to -1               One variable increases as the other decreases.
-1.0                  Perfect negative or inverse correlation.
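
To make the range of r and rs concrete, here is a minimal Python sketch that computes both coefficients from scratch. It is not Prism's code: the function names (pearson_r, ranks, spearman_rs) and the sample data are invented for illustration, and the Spearman coefficient is computed here as the Pearson correlation of the ranks, with tied values sharing their average rank.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rs(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data, chosen only to illustrate the table above.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]          # rises in lockstep with x
y_down = [10, 8, 6, 4, 2]        # falls in lockstep with x
y_curve = [1, 8, 27, 64, 125]    # rises with x, but not along a line

print("r, perfect positive:", pearson_r(x, y_up))
print("r, perfect negative:", pearson_r(x, y_down))
print("r, curved:", pearson_r(x, y_curve))
print("rs, curved:", spearman_rs(x, y_curve))
```

In the curved example, rs reaches 1 because the ranks agree perfectly, while r stays below 1 because the points do not fall on a straight line.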

 

If r or rs is far from zero, there are four possible explanations:

Changes in the X variable change the value of the Y variable.
Changes in the Y variable change the value of the X variable.
Changes in another variable influence both X and Y.
X and Y don't really correlate at all, and you just happened to observe such a strong correlation by chance. The P value quantifies the likelihood that this could occur.

If you choose Spearman nonparametric correlation, Prism computes the confidence interval of the Spearman correlation coefficient using an approximation. According to Zar (Biostatistical Analysis), this approximation should only be used when N>10. So with smaller N, Prism simply does not report the confidence interval of the Spearman correlation coefficient.

R squared

Perhaps the best way to interpret the value of r is to square it to calculate r². Statisticians call this quantity the coefficient of determination, but scientists call it "r squared". It is a value that ranges from zero to one, and is the fraction of the variance in the two variables that is “shared”. For example, if r²=0.59, then 59% of the variance in X can be explained by variation in Y. Likewise, 59% of the variance in Y can be explained by variation in X. More simply, 59% of the variance is shared between X and Y.

Prism only calculates an r² value from the Pearson correlation coefficient. It is not appropriate to compute r² from the nonparametric Spearman correlation coefficient.
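
One way to see why r squared reads as "shared" variance is to compare it with the fraction of variance explained by a least-squares straight line. The Python sketch below does this in both directions (Y from X, then X from Y). The helper names and the data are hypothetical, and the code illustrates the idea only; it is not taken from Prism.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fraction_explained(x, y):
    """1 - SS_residual / SS_total for the least-squares line predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_residual / ss_total


# Hypothetical measurements, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [3, 1, 4, 6, 5, 9]

r = pearson_r(x, y)
r_squared = r ** 2

print("r squared:", r_squared)
print("variance in Y explained by X:", fraction_explained(x, y))
print("variance in X explained by Y:", fraction_explained(y, x))
```

All three printed values agree (up to rounding), which is the symmetry described above: the same fraction of variance is explained whichever variable is treated as the predictor.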

P value

The P value answers this question:

If there really is no correlation between X and Y overall, what is the chance that random sampling would result in a correlation coefficient as far from zero (or further) as observed in this experiment?

If the P value is small, you can reject the idea that the correlation is due to random sampling.
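
The question the P value answers can be simulated directly. The Python sketch below uses a permutation test: it shuffles the Y values many times, which breaks any real link with X, and counts how often the shuffled data give a correlation coefficient at least as far from zero as the one observed. This is a swapped-in technique for illustration, not necessarily the calculation Prism performs. The function name, the number of shuffles, the seed, and the data are all assumptions.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_p_value(x, y, n_shuffles=5000, seed=0):
    """Two-tailed P value estimated by shuffling y (a sketch, not Prism's method)."""
    rng = random.Random(seed)          # fixed seed so the result is reproducible
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        # Small tolerance so floating-point noise does not hide exact ties.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            as_extreme += 1
    # Count the observed arrangement itself so the estimate is never exactly 0.
    return (as_extreme + 1) / (n_shuffles + 1)


# Hypothetical data, for illustration only.
x = list(range(1, 11))
y_strong = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3, 6.8, 8.1, 9.2, 9.9]
y_weak = [5, 1, 8, 3, 9, 2, 7, 4, 10, 6]

print("strong correlation, P =", permutation_p_value(x, y_strong))
print("weak correlation,   P =", permutation_p_value(x, y_weak))
```

With the strongly correlated data, shuffled arrangements almost never match the observed r, so the estimated P value is small. With the weakly correlated data, many shuffles do as well or better, so the P value is large.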

If the P value is large, the data do not give you any reason to conclude that the correlation is real. This is not the same as saying that there is no correlation at all. You just have no compelling evidence that the correlation is real and not due to chance. Look at the confidence interval for r. It will extend from a negative correlation to a positive correlation. If the entire interval consists of values near zero that you would consider biologically trivial, then you have strong evidence that either there is no correlation in the population or that there is a weak (biologically trivial) association. On the other hand, if the confidence interval contains correlation coefficients that you would consider biologically important, then you can't make any strong conclusion from this experiment. To make a strong conclusion, you'll need data from a larger experiment.
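
To show how a confidence interval for r can span zero in a small experiment and tighten in a larger one, the Python sketch below uses the Fisher z transformation, a common approximation for the Pearson coefficient. This page does not say which method Prism uses, so treat the choice of method as an assumption. The function name fisher_ci and the example values of r and n are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def fisher_ci(r, n, confidence=0.95):
    """Approximate confidence interval for a Pearson r from n pairs (Fisher z)."""
    if n <= 3:
        raise ValueError("the Fisher z approximation needs more than 3 pairs")
    z = atanh(r)                        # transform r to an approximately normal scale
    standard_error = 1 / sqrt(n - 3)
    critical = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Transform the interval limits back to the correlation scale.
    return tanh(z - critical * standard_error), tanh(z + critical * standard_error)


# Same hypothetical r, observed in a small and in a large experiment.
small_low, small_high = fisher_ci(0.3, 12)
large_low, large_high = fisher_ci(0.3, 200)

print("n = 12: ", small_low, "to", small_high)
print("n = 200:", large_low, "to", large_high)
```

With 12 pairs the interval runs from a negative to a clearly positive correlation, so no strong conclusion is possible. With 200 pairs the same r gives an interval that excludes zero.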

If you entered data onto a one-grouping-variable table and requested a correlation matrix, Prism will report a P value for the correlation of each column with every other column. These P values do not include any correction for multiple comparisons.
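
Because the P values in a correlation matrix are uncorrected, you may want to adjust them yourself. The Python sketch below applies two standard corrections, Bonferroni and Holm, to a set of made-up P values for the six pairs formed by four columns. The column names, the P values, and the function names are hypothetical, and Prism does not apply either correction to a correlation matrix.

```python
def bonferroni(p_values):
    """Multiply each P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjustment; never larger than the Bonferroni value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest P value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Adjusted values must not decrease as the raw P values increase.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical uncorrected P values for every pair of four columns A-D.
raw = {
    ("A", "B"): 0.004,
    ("A", "C"): 0.03,
    ("A", "D"): 0.20,
    ("B", "C"): 0.012,
    ("B", "D"): 0.65,
    ("C", "D"): 0.045,
}

pairs = list(raw)
p_raw = [raw[pair] for pair in pairs]
p_bonferroni = bonferroni(p_raw)
p_holm = holm(p_raw)

for pair, p, pb, ph in zip(pairs, p_raw, p_bonferroni, p_holm):
    print(pair, "raw:", p, "Bonferroni:", pb, "Holm:", ph)
```

With k columns there are k(k-1)/2 pairs, so the number of comparisons grows quickly as columns are added.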



Copyright (c) 2007 GraphPad Software Inc. All rights reserved.
URL: http://www.graphpad.com/help/Prism5/Prism5Help.html?how_to_think_about_results_of_linear_correlation.htm