The advantages of centered polynomial regression
New to Prism 5.02 (Windows) and 5.0b (Mac) is a set of centered polynomial equations. For example, when you look in the list of polynomials you'll see both 'Second order polynomial' and 'Centered second order polynomial'. We recommend always choosing one of the centered equations instead of an ordinary polynomial equation. This page explains why.
What's wrong with ordinary polynomial models?
The standard polynomial models look like this (shown here for a second order polynomial):
Y = B0 + B1*X + B2*X^2
More terms are included with the higher order equations. There are two problems with ordinary polynomial fits:
- When the X values are large, and start well above zero (for example, when X is a calendar year), taking the very large X values to large powers can lead to math overflow. Even if the program doesn't report any math error, the results can be inaccurate. Some coefficients will be positive and some negative, so the value of Y depends on subtracting huge numbers from other huge numbers, leading to imprecise results.
- Even when the X values are not large, the parameters of the model are intertwined, so they have high covariance and dependency. This results in large standard errors, wide confidence intervals, and huge confidence or prediction bands. In many cases, this problem is severe enough that Prism reports that the results are 'ambiguous', so it doesn't report confidence intervals for all the parameters and can't graph confidence bands.
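Both problems can be seen in a small numerical sketch. The example below is illustrative only: the calendar-year X values are hypothetical, and the correlation between the X and X^2 terms is used as a rough stand-in for parameter dependency, not as the quantity Prism itself reports.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical X values: calendar years, large and far from zero.
years = [float(y) for y in range(2000, 2011)]
x_mean = sum(years) / len(years)
centered = [x - x_mean for x in years]

# Correlation between the linear and quadratic terms, before and after centering.
raw_corr = pearson(years, [x ** 2 for x in years])
centered_corr = pearson(centered, [xc ** 2 for xc in centered])

print(f"largest X^2 term: {max(years) ** 2:.0f}")
print(f"corr(X, X^2)   = {raw_corr:.8f}")
print(f"corr(XC, XC^2) = {centered_corr:.8f}")
```

With uncentered years, the X and X^2 terms are almost perfectly correlated, so the fit cannot easily tell their coefficients apart. Subtracting the mean removes that correlation here because these X values are evenly spaced around the mean. With unevenly spaced data, centering typically reduces the correlation without eliminating it.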
What are centered polynomial models?
Both problems go away when the X values are centered. The idea of centering is to subtract the mean X from all X values before fitting the model. This can be done as part of nonlinear regression, using this model:
XC = X - Xmean
Y = B0 + B1*XC + B2*XC^2
Here XC is the centered X value, equal to the X value minus Xmean, which is the mean of all X values. In other words, XC is the distance of any X value from the mean of all X values. Xmean is constant, and not a parameter that Prism tries to fit. Of course, you can include more terms in the definition of Y to create higher order polynomial equations.
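As a minimal sketch of the idea (not Prism's implementation), the centered model can be written as a function in which Xmean is computed once from the data and then held fixed. The function name, the coefficient names B0, B1, B2, the data, and the coefficient values are all assumptions made for illustration.

```python
def centered_polynomial(x, x_mean, coefficients):
    """Evaluate B0 + B1*XC + B2*XC^2 + ... where XC = x - x_mean."""
    xc = x - x_mean
    return sum(b * xc ** power for power, b in enumerate(coefficients))

# Hypothetical data: Xmean is computed once from the X values and then held
# constant; only the coefficients would be adjusted during a fit.
x_values = [2000.0, 2002.0, 2004.0, 2006.0, 2008.0]
x_mean = sum(x_values) / len(x_values)

# Illustrative (not fitted) coefficients for a second order model.
coefficients = [10.0, 1.5, 0.25]
y_at_2006 = centered_polynomial(2006.0, x_mean, coefficients)
print(x_mean, y_at_2006)
```

Passing a longer coefficient list gives a higher order polynomial without changing the function.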
The advantages of centered models
Fitting the centered model leads to exactly the same curve (unless the regular approach led to math errors). Accordingly, the sum-of-squares is the same, as are results of model comparisons.
However, the centered equation reparameterizes the model. The parameters have different meanings, so they have different best-fit values (except the coefficient of the highest-order term, which is unchanged), different standard errors and confidence intervals, smaller covariances and dependencies, and tighter confidence/prediction bands.
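The relationship between the two parameterizations follows from expanding the powers of (X - Xmean). The sketch below does this for a second order polynomial, using hypothetical coefficient values and a hypothetical helper named to_ordinary, and checks that both forms describe the same curve.

```python
import math

def to_ordinary(c0, c1, c2, x_mean):
    """Expand c0 + c1*(X - m) + c2*(X - m)^2 into b0 + b1*X + b2*X^2."""
    b0 = c0 - c1 * x_mean + c2 * x_mean ** 2
    b1 = c1 - 2.0 * c2 * x_mean
    b2 = c2  # the highest-order coefficient is unchanged by centering
    return b0, b1, b2

# Hypothetical centered coefficients and mean X (calendar years).
x_mean = 2005.0
c0, c1, c2 = 3.0, 0.5, 0.25
b0, b1, b2 = to_ordinary(c0, c1, c2, x_mean)

# Both parameterizations give the same Y at every X.
for x in (2000.0, 2005.0, 2010.0):
    xc = x - x_mean
    y_centered = c0 + c1 * xc + c2 * xc ** 2
    y_ordinary = b0 + b1 * x + b2 * x ** 2
    assert math.isclose(y_centered, y_ordinary, abs_tol=1e-6)

print("ordinary:", b0, b1, b2)
print("centered:", c0, c1, c2)
```

In the centered form, the constant term is the predicted Y at the mean X. In the ordinary form, it is the predicted Y at X = 0, which may lie far outside the data. The ordinary coefficients are also much larger in magnitude and of mixed sign, which is the source of the cancellation problem with large X values.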
How centered models are implemented in Prism
Prism 5.02 and 5.0b include a set of centered polynomial equations as part of the built-in set of polynomial equations. You can fit data to these without knowing how Prism implements the model. If you are curious, read on.
Prism 5.02 and 5.0b offer a new choice when constraining a parameter of an equation used in nonlinear regression: "Data set constant (= Mean X)". The built-in centered polynomial equations, written as shown above, use this feature to constrain the parameter Xmean to equal the mean of the X values. Note that if you open a file that uses centered polynomial regression in an older version of Prism, that constraint will be lost.