## Analysis checklist: Simple linear regression

### Can the relationship between X and Y be graphed as a straight line?

In many experiments the relationship between X and Y is curved, making linear regression inappropriate. It rarely helps to transform the data to force the relationship to be linear. It is usually better to fit the curve directly with nonlinear regression.
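One informal way to spot curvature is to fit a line anyway and look at the signs of the residuals in order of X. The sketch below uses made-up, noise-free data (Y = X squared) and a hypothetical helper `fit_line`; it is an illustration of the idea, not a description of how Prism tests for linearity.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical curved data, for illustration only: Y = X squared.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8]
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Mark each residual as above (+) or below (-) the fitted line, in X order.
signs = "".join("+" if r > 0 else "-" for r in residuals)
print(signs)  # prints "++-----++"
```

When the residuals cluster by sign like this (above the line at both ends, below it in the middle) rather than alternating haphazardly, a straight line is probably the wrong model.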

### Is the scatter of data around the line Gaussian (at least approximately)?

Linear regression analysis assumes that the scatter of data around the best-fit line is Gaussian. In other words, it assumes that the residuals (the vertical distances of the points from the best-fit line) are sampled from a Gaussian (normal) distribution.
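A rough way to see what this assumption implies: for Gaussian scatter, about 68% of the residuals should fall within one standard deviation of the line and about 95% within two. The sketch below checks this on simulated data; the line, the noise level, and the seed are all assumptions chosen for illustration, and this quick check is not a formal normality test.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical data: a true line (Y = 2 + 0.5 X) plus Gaussian scatter with SD 1.
xs = [i / 100 for i in range(2000)]
ys = [2.0 + 0.5 * x + random.gauss(0.0, 1.0) for x in xs]

# Ordinary least-squares fit.
mean_x = statistics.fmean(xs)
mean_y = statistics.fmean(ys)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Residuals are the vertical distances of the points from the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
sd = statistics.stdev(residuals)  # rough estimate of the scatter

# Fraction of residuals within 1 SD and within 2 SD of the line.
within_1sd = sum(abs(r) <= sd for r in residuals) / len(residuals)
within_2sd = sum(abs(r) <= 2 * sd for r in residuals) / len(residuals)
print(f"within 1 SD: {within_1sd:.2f}, within 2 SD: {within_2sd:.2f}")
```

Fractions far from 0.68 and 0.95 suggest that the scatter is not Gaussian, for example because of outliers or a skewed distribution.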

### Is the variability the same everywhere?

Linear regression assumes that the scatter of points around the best-fit line has the same standard deviation all along the line. The assumption is violated if the points with high or low X values tend to be farther from the best-fit line. The assumption that the standard deviation is the same everywhere is termed homoscedasticity. If the scatter goes up as Y goes up, you need to perform a weighted regression. Prism can't do this via the linear regression analysis. Instead, use nonlinear regression but choose to fit a straight-line model.
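To make the idea of weighted regression concrete, here is a minimal sketch of a weighted least-squares line. The function name `fit_line_weighted`, the data, and the choice of weights (1/Y², a common choice when the scatter is roughly proportional to Y) are assumptions for illustration; this is not Prism's implementation.

```python
def fit_line_weighted(xs, ys, weights):
    """Weighted least-squares line: minimizes sum(w * (y - a - b*x)**2)."""
    total_w = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total_w
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total_w
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data in which the scatter grows as Y grows.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.8, 6.5, 7.4, 11.0, 10.9]

equal_weights = [1.0] * len(xs)                 # ordinary (unweighted) fit
relative_weights = [1.0 / y ** 2 for y in ys]   # assumed weighting: 1/Y^2

print("unweighted:", fit_line_weighted(xs, ys, equal_weights))
print("weighted:  ", fit_line_weighted(xs, ys, relative_weights))
```

With equal weights the function reduces to ordinary linear regression. With 1/Y² weights, points with large Y (and therefore large expected scatter) pull on the line less.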

### Do you know the X values precisely?

The linear regression model assumes that X values are exactly correct, and that experimental error or biological variability affects only the Y values. This is rarely strictly true, but it is sufficient that any imprecision in measuring X be very small compared to the variability in Y.

### Are the data points independent?

Linear regression assumes that whether one point is above or below the line is a matter of chance and does not influence whether another point is above or below the line. This assumption must hold for your data if you are to interpret the results correctly.

Note that Prism does not currently provide a way to handle repeated measures designs (mixed-effects models). For example, an experiment that measures each of four animals once at each of six time points would generate 24 values. Prism will treat these data as 24 independent points even though they are not. If, in this example, one animal had higher measurements at all time points, Prism would not account for the repeated measures and the results could be misleading.
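The four-animal example can be sketched numerically. The code below builds 24 made-up, noise-free values in which the fourth animal reads 3 units higher at every time point, fits one pooled line, and groups the residuals by animal. All numbers and names are hypothetical.

```python
times = [0, 1, 2, 3, 4, 5]
animal_offsets = [0.0, 0.0, 0.0, 3.0]  # hypothetical: the fourth animal reads higher

# Build the pooled data set: 4 animals x 6 time points = 24 values.
xs, ys, animal_ids = [], [], []
for animal, offset in enumerate(animal_offsets):
    for t in times:
        xs.append(t)
        ys.append(1.0 + 2.0 * t + offset)
        animal_ids.append(animal)

# Ordinary least-squares fit that treats all 24 values as independent.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Group the residuals by animal.
residuals_by_animal = {animal: [] for animal in range(len(animal_offsets))}
for x, y, animal in zip(xs, ys, animal_ids):
    residuals_by_animal[animal].append(y - (intercept + slope * x))

for animal, res in residuals_by_animal.items():
    print(f"animal {animal}:", [round(r, 2) for r in res])
```

Every residual from the fourth animal lies above the pooled line and every residual from the other three lies below it. Knowing which animal a point came from tells you which side of the line it falls on, which is exactly the dependence the independence assumption rules out.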

### Are the X and Y values intertwined?

If the value of X is used to calculate Y (or the value of Y is used to calculate X), then linear regression calculations are invalid. One example is a Scatchard plot, where the Y value (bound/free) is calculated from the X value. Another example is a graph of midterm exam scores (X) vs. total course grades (Y). Since the midterm exam score is a component of the total course grade, linear regression is not valid for these data.
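A small simulation shows why the midterm example is a problem. In the sketch below, midterm scores and scores on the rest of the course are generated independently, yet the midterm still correlates strongly with the total simply because it is part of the total. The score distributions, the seed, and the helper `pearson_r` are assumptions for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


random.seed(1)  # fixed seed so the illustration is repeatable

# Hypothetical scores: the midterm and the rest of the course are independent.
midterm = [random.gauss(50, 10) for _ in range(5000)]
rest = [random.gauss(50, 10) for _ in range(5000)]
total = [m + r for m, r in zip(midterm, rest)]  # Y is computed from X

r_rest = pearson_r(midterm, rest)    # expected to be near 0
r_total = pearson_r(midterm, total)  # expected to be near 0.7
print(f"midterm vs rest:  r = {r_rest:.2f}")
print(f"midterm vs total: r = {r_total:.2f}")
```

The strong correlation with the total says nothing about whether midterm performance predicts later performance. It arises only because X was used to calculate Y.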

© 1995-2019 GraphPad Software, LLC. All rights reserved.