KNOWLEDGEBASE - ARTICLE #711

How can R2 be negative?

R2 is computed from the sum of the squares of the distances of the points from the best-fit curve determined by nonlinear regression, or from the best-fit line determined by linear regression (in which case it is called r2, in lower case). This sum-of-squares value is the sum of squares of the residuals, so it is called SSres, and it is expressed in the units of the Y axis squared. To turn R2 into a fraction, SSres is normalized to the sum of the squares of the distances of the points from a horizontal line through the mean of all Y values. This value is called SStot. If the curve fits the data well, SSres will be much smaller than SStot. R2 is calculated using this equation:

R2 = 1 - SSres/SStot
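As a minimal sketch of this calculation outside of Prism, the equation can be written in a few lines of Python. The function name r_squared and the data values are hypothetical, chosen only for illustration; the predicted values stand in for whatever curve or line was fit.

```python
def r_squared(y_obs, y_fit):
    """Return R2 = 1 - SSres/SStot for observed and model-predicted Y values."""
    mean_y = sum(y_obs) / len(y_obs)
    # SSres: squared vertical distances of the points from the fitted curve
    ss_res = sum((y - f) ** 2 for y, f in zip(y_obs, y_fit))
    # SStot: squared vertical distances from a horizontal line at the mean Y
    ss_tot = sum((y - mean_y) ** 2 for y in y_obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed values and predictions from a model that fits well
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_fit = [1.1, 1.9, 3.2, 3.8, 5.0]

r2_good = r_squared(y_obs, y_fit)
print(r2_good)  # close to 1, because SSres is much smaller than SStot
```

Here SSres is about 0.10 and SStot is 10, so R2 comes out at roughly 0.99.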

 

Appearances can be deceptive. R2 is not really the square of anything. If SSres is larger than SStot, R2 will be negative (see equation above). While it is surprising to see something called "squared" have a negative value, it is not impossible (since R2 is not actually the square of R). 

How can this happen? SSres is the sum of the squares of the vertical distances of the points from a curve (or line). SStot is the sum of the squares of the vertical distances of the points from a horizontal line drawn at the mean Y value. SSres will exceed SStot when the line or curve fits the data even worse than does a horizontal line. 
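The comparison described above can be checked numerically. The sketch below uses made-up data and a hypothetical helper, sum_of_squares, to compute SStot from the horizontal line at the mean and SSres from a line that slopes in the wrong direction.

```python
def sum_of_squares(y_obs, y_ref):
    """Sum of squared vertical distances between observed and reference Y values."""
    return sum((y - r) ** 2 for y, r in zip(y_obs, y_ref))


# Hypothetical data that rise steadily with X
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
mean_y = sum(y_obs) / len(y_obs)

# Distances from the horizontal line at the mean Y value
ss_tot = sum_of_squares(y_obs, [mean_y] * len(y_obs))

# Predictions from a (hypothetical) line that goes downhill instead of uphill
wrong_way = [5.0, 4.0, 3.0, 2.0, 1.0]
ss_res = sum_of_squares(y_obs, wrong_way)

r2 = 1.0 - ss_res / ss_tot
print(ss_tot, ss_res, r2)  # prints 10.0 40.0 -3.0
```

Because the downhill line is farther from the points than the horizontal line is, SSres (40) exceeds SStot (10) and R2 is -3.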

R2 will be negative when the line or curve does an awful job of fitting the data. This can happen when you fit a poorly chosen model (perhaps by mistake, or perhaps because the model was fit to a different data set), or when you apply constraints to the model that don't make any sense (perhaps you entered a positive number when you intended to enter a negative number). For example, if you constrain the Hill slope of a dose-response curve to be greater than 1.0, but the curve actually goes downhill (so the Hill slope should be negative), you might end up with a negative R2 value and nonsense values for the parameters.

 

Below is a simple example. The blue line is the fit of a straight line constrained to intercept the Y axis at Y=150 when X=0. SSres is the sum of the squares of the distances of the red points from this blue line. SStot is the sum of the squares of the distances of the red points from the green horizontal line. Since SSres is much larger than SStot, the R2 (for the fit of the blue line) is negative. This is because the constraint -- that the Y intercept equals 150 -- makes no sense with these data.
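A similar situation can be reproduced with a short Python sketch. The data below are invented for illustration and are not the values in the Prism file; the only thing taken from the example is the constraint that the line must pass through Y=150 at X=0. With the intercept fixed, the least-squares slope is sum(x*(y - 150)) / sum(x*x). An ordinary unconstrained fit is included for contrast.

```python
def r_squared(y_obs, y_fit):
    """Return R2 = 1 - SSres/SStot."""
    mean_y = sum(y_obs) / len(y_obs)
    ss_res = sum((y - f) ** 2 for y, f in zip(y_obs, y_fit))
    ss_tot = sum((y - mean_y) ** 2 for y in y_obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: Y rises from about 10 to about 50, nowhere near 150
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [12.0, 18.0, 33.0, 39.0, 52.0]

# Straight line with the Y intercept constrained to 150 (only the slope is fit)
intercept_fixed = 150.0
slope_c = sum(xi * (yi - intercept_fixed) for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
r2_constrained = r_squared(y, [intercept_fixed + slope_c * xi for xi in x])

# Ordinary least-squares line with both slope and intercept fit
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
slope_u = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
           / sum((xi - mean_x) ** 2 for xi in x))
intercept_u = mean_y - slope_u * mean_x
r2_unconstrained = r_squared(y, [intercept_u + slope_u * xi for xi in x])

print(r2_constrained)    # negative: the constrained line fits worse than the mean
print(r2_unconstrained)  # close to 1: the same model fits well without the constraint
```

The model (a straight line) is the same in both fits; only the nonsensical constraint drives R2 below zero.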

 

If R2 is negative, check that you picked an appropriate model, and set any constraints correctly.

