| Other kinds of regression
The difference between correlation and regression
Correlation quantifies how consistently the two variables vary together. When the two variables vary together, statisticians say that there is a lot of covariation or correlation. The direction and magnitude of correlation are quantified by the correlation coefficient, r. Unlike regression, correlation calculations do not find a best-fit line. See "Correlation" on page 135.
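As an illustration of what r measures, here is a minimal sketch in Python, assuming NumPy is available. The function name pearson_r and the sample values are hypothetical and are not part of Prism. Note that the calculation yields a single coefficient and never produces a slope or intercept.

```python
import numpy as np

def pearson_r(x, y):
    """Return the Pearson correlation coefficient r for paired samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()          # deviations of X from its mean
    dy = y - y.mean()          # deviations of Y from its mean
    # Covariation divided by the product of the two spreads.
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

# Hypothetical paired measurements that rise together.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(x, y)            # close to +1: strong positive correlation
```

The sign of r gives the direction of the relationship and its absolute value gives the magnitude.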
Polynomial regression
Polynomial regression fits data to this equation:

$$Y = A + B X + C X^2 + D X^3 + E X^4 + \cdots$$
You can include any number of terms. If you stop at the second (B) term, it is called a first-order polynomial equation, which is identical to the equation for a straight line. If you stop after the third (C) term, it is called a second-order, or quadratic, equation. If you stop after the fourth term, it is called a third-order, or cubic, equation.
If you choose a second, or higher, order equation, the graph of Y vs. X will be curved (depending on your choice of A, B, C, …). Nonetheless, the polynomial equation is not strictly a nonlinear equation. Holding X and the other parameters constant, a graph of Y vs. any one parameter (A, B, C, …) would be linear. From a mathematical point of view, the polynomial equation is linear. This means little to scientists, but it means a lot to a mathematician, because it is quite easy to write a program to fit data to linear equations. Because polynomial regression is related to linear regression, you don't have to enter any initial values.
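Because the equation is linear in its parameters, the fit reduces to a single linear least-squares solve, which is why no initial values are needed. The following is a minimal sketch in Python, assuming NumPy and a polynomial written as Y = A + B·X + C·X² and so on. The function name fit_polynomial and the data are hypothetical. It is not how Prism implements the fit.

```python
import numpy as np

def fit_polynomial(x, y, order):
    """Least-squares fit of Y = A + B*X + C*X^2 + ... up to the given order.

    Returns the coefficients in increasing order of power: [A, B, C, ...].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix whose columns are 1, X, X^2, ..., X^order.
    design = np.vander(x, order + 1, increasing=True)
    # One direct solve; no starting guesses and no iteration by the caller.
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

# Hypothetical noise-free data generated from A=1, B=2, C=0.5.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x + 0.5 * x**2
a, b, c = fit_polynomial(x, y, order=2)
```

With noise-free data the fit recovers the generating coefficients to within rounding error. With real data it returns the least-squares estimates.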
But there is a fundamental problem with polynomial regression: Few biological or chemical models are described by polynomial equations. This means that the best-fit results can rarely be interpreted in terms of biology or chemistry.
Polynomial regression can be useful to create a standard curve for interpolation, or to create a smooth curve for graphing. But polynomial regression is rarely useful for fitting a model to biological data.
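The standard-curve use can be sketched as follows in Python, assuming NumPy. The concentrations, signal values, and the helper name interpolate_signal are all hypothetical. The sketch refuses to extrapolate, since a polynomial chosen only for smoothness has no meaning outside the range of the standards.

```python
import numpy as np

# Hypothetical standards: known concentrations and their measured signals.
conc = np.array([0.0, 1.0, 2.0, 4.0, 8.0])
signal = np.array([0.02, 0.21, 0.39, 0.70, 1.15])

# Second-order polynomial; np.polyfit returns the highest power first.
coeffs = np.polyfit(conc, signal, deg=2)

def interpolate_signal(x):
    """Read the fitted curve at X, only within the range of the standards."""
    x = np.asarray(x, dtype=float)
    if np.any(x < conc.min()) or np.any(x > conc.max()):
        raise ValueError("X is outside the range of the standards")
    return np.polyval(coeffs, x)

predicted = interpolate_signal(3.0)   # value read off the smooth curve
```

The fitted coefficients serve only to draw the curve; they are not estimates of any biological or chemical quantity.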
To perform polynomial regression with Prism, choose the nonlinear regression analysis but pick a polynomial equation.
Multiple regression
Multiple regression fits data to a model that defines Y as a function of two or more independent (X) variables. For example, a model might define a biological response as a function of both time and concentration. The term multiple regression is usually used to mean fitting data to a linear equation with two or more X variables (X1, X2, …).

$$Y = A + B X_1 + C X_2 + D X_3 + \cdots$$
Nonlinear multiple regression models define Y as a function of several X variables using a more complicated equation. Prism cannot perform any kind of multiple regression. The companion program, GraphPad InStat, can perform basic multiple regression using the above equation.
No GraphPad programs can perform multiple nonlinear regression.
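For readers working outside GraphPad programs, linear multiple regression is another linear least-squares problem. The following is a minimal sketch in Python, assuming NumPy and a model of the form Y = A + B·X1 + C·X2. The function name fit_multiple_linear and the time, concentration, and response values are hypothetical.

```python
import numpy as np

def fit_multiple_linear(X, y):
    """Least-squares fit of Y = A + B*X1 + C*X2 + ...

    X has one column per independent variable.
    Returns [A, B, C, ...] with the intercept first.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the intercept A is estimated too.
    design = np.column_stack([np.ones(len(y)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

# Hypothetical response measured at two concentrations over four time points.
time = np.array([0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0])
conc = np.array([1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0])
response = 2.0 + 0.5 * time + 1.5 * conc   # noise-free for illustration
coeffs = fit_multiple_linear(np.column_stack([time, conc]), response)
```

Nonlinear multiple regression would instead need an iterative fitting routine and initial values, which this sketch does not attempt.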
Logistic and proportional hazards regression
Linear, nonlinear and polynomial regression all fit data to models where Y is a continuous measured variable such as weight, concentration, receptor number, or enzyme activity.
If Y is a binomial outcome (for example male vs. female, pass vs. fail, viable vs. not viable) you need to use a different kind of regression, called logistic regression. Prism does not perform logistic regression. The only analysis it can do with binomial outcomes is to analyze contingency tables. By analyzing a contingency table, you can compare a binomial outcome in two or more groups. With logistic regression, you can compare outcomes after correcting for other differences between groups.
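To show what logistic regression involves, here is a minimal sketch in Python, assuming NumPy. It fits the model P(Y=1) = 1 / (1 + exp(-(b0 + b1·X1 + …))) by Newton's method. The function name fit_logistic, the fixed iteration count, and the dose and viability data are hypothetical choices for illustration. A dedicated statistics package would be the usual choice for real analyses.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic regression by Newton's method.

    X holds one column per predictor; y holds 0/1 outcomes.
    Returns [b0, b1, ...] with the intercept first.
    """
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))      # predicted probabilities
        gradient = design.T @ (y - p)                 # score vector
        weights = p * (1.0 - p)
        hessian = design.T @ (design * weights[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

# Hypothetical data: outcome (1 = viable) at increasing doses, with overlap
# between the groups so that the maximum-likelihood estimate exists.
dose = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
viable = np.array([0, 0, 1, 0, 1, 1, 0, 1])
beta = fit_logistic(dose, viable)
```

Additional columns in X, such as a group indicator alongside a covariate, are how the comparison is corrected for other differences between groups.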
If Y is a survival time, you need to use yet another kind of regression, called proportional hazards regression. Prism can compare survival curves using the logrank test, but cannot perform proportional hazards regression. Proportional hazards regression lets you compare survival curves after correcting for other differences.
|