# Choosing diagnostics for multiple regression

## How precise are the best-fit values of the parameters?

The best-fit value of each parameter is, of course, an estimate of the true value, which you could know only if you had an infinite amount of data. The precision of each parameter can be expressed as a standard error, a confidence interval, or both. Standard errors are more conventional. Confidence intervals are easier to interpret.
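To make the relationship between the two measures concrete, here is a minimal sketch of how standard errors and confidence intervals can be computed for an ordinary least-squares fit. It uses NumPy and SciPy rather than Prism. The function name `fit_with_precision`, the simulated data, and the choice of a t-based interval are assumptions made for illustration, not a description of Prism's internals.

```python
import numpy as np
from scipy import stats

def fit_with_precision(X, y, level=0.95):
    """Least-squares fit returning estimates, standard errors and CI limits.

    X: (N, K) design matrix; the first column is all ones (the intercept).
    y: (N,) response values.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    dof = n - k                                # degrees of freedom, N - K
    s2 = residuals @ residuals / dof           # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)          # parameter covariance matrix
    se = np.sqrt(np.diag(cov))                 # standard error of each parameter
    t_crit = stats.t.ppf(0.5 + level / 2, dof) # two-sided critical t value
    return beta, se, beta - t_crit * se, beta + t_crit * se

# Simulated example data (hypothetical): y depends on x1 and x2 plus noise.
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.3, size=50)
X = np.column_stack([np.ones(50), x1, x2])

beta, se, lower, upper = fit_with_precision(X, y)
for name, b, s, lo, hi in zip(["intercept", "x1", "x2"], beta, se, lower, upper):
    print(f"{name}: {b:.3f} (SE {s:.3f}, 95% CI {lo:.3f} to {hi:.3f})")
```

The confidence interval is the best-fit value plus or minus a multiple of the standard error, which is why the two carry the same information in different forms.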

You can also ask Prism to test the statistical significance of each parameter. The null hypothesis is that the true population value of the parameter is 0.0. The P value answers the question: If that null hypothesis were true, what is the chance that analysis of a randomly selected data sample will result in a parameter as far, or further, from zero than reported from this analysis? Prism also reports whether the result is "statistically significant" defined as a P value less than 0.05.
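As a rough illustration of that P value, the sketch below computes a two-sided P value from a parameter's best-fit value, its standard error, and the residual degrees of freedom, using the t distribution from SciPy. The function name `parameter_p_value` and the example numbers are hypothetical, and the sketch assumes the usual t ratio (estimate divided by standard error).

```python
from scipy import stats

def parameter_p_value(estimate, std_error, dof):
    """Two-sided P value for the null hypothesis that the true parameter is 0."""
    t_ratio = estimate / std_error
    # Probability of a t ratio at least this far from zero, in either direction.
    p = 2 * stats.t.sf(abs(t_ratio), dof)
    return t_ratio, p

# Hypothetical values: 30 rows and 3 fitted parameters give 27 degrees of freedom.
t_ratio, p = parameter_p_value(estimate=1.8, std_error=0.6, dof=27)
significant = p < 0.05
print(f"t = {t_ratio:.2f}, P = {p:.4f}, significant at 0.05: {significant}")
```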

Choose to compute significance only when you genuinely want to know whether the true parameter value could be zero (which would mean that variable has no impact on the model). For many parameters, this is not a question worth asking.

## Are the parameters intertwined or redundant?

Prism can report a parameter covariance matrix to show you how each parameter in the model is correlated with every other parameter. If the Parameter covariance matrix option is selected, Prism will generate an additional results tab with the parameter correlations, and will also generate a heatmap of these correlations. Prism can also quantify multicollinearity, that is, how well each variable can be predicted from the other variables.
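Both ideas can be sketched in a few lines of NumPy. The first function converts a covariance matrix into correlations. The second regresses each independent variable on all the others and reports the resulting R2, which is one way to express how well each variable is predicted by the rest. The function names, the simulated data, and the variance inflation factor (VIF, computed here as 1/(1 - R2)) are assumptions added for illustration.

```python
import numpy as np

def covariance_to_correlation(cov):
    """Scale a covariance matrix so that its diagonal becomes 1."""
    cov = np.asarray(cov, dtype=float)
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

def r2_with_other_variables(X):
    """For each column of X (no intercept column), the R2 obtained when that
    column is predicted from all the other columns by least squares."""
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    r2 = np.empty(m)
    for j in range(m):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        ss_tot = np.sum((target - target.mean()) ** 2)
        r2[j] = 1 - (resid @ resid) / ss_tot
    return r2

# Hypothetical data: c is nearly redundant because it is close to a + b.
rng = np.random.default_rng(1)
a = rng.normal(size=100)
b = rng.normal(size=100)
c = a + b + rng.normal(scale=0.1, size=100)
X = np.column_stack([a, b, c])

r2 = r2_with_other_variables(X)
vif = 1 / (1 - r2)  # variance inflation factor; grows as R2 approaches 1
print("R2 with other variables:", np.round(r2, 3))
print("VIF:", np.round(vif, 1))
print(covariance_to_correlation([[4.0, 2.0], [2.0, 9.0]]))
```

An R2 close to 1 for a variable means it adds little information beyond what the other variables already provide.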

## How to quantify goodness of fit

### If you chose least-squares regression

Multiple R, the coefficient of multiple correlation, is the correlation between the Y values and the predicted Y values. It is the square root of R² and its value is always between 0 and 1.

R² is the standard way to assess goodness-of-fit.

The adjusted R² takes into account the number of parameters fit to the data, so it has a lower value than R² (unless you fit only one parameter, in which case R² and adjusted R² are identical).
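The three values are closely related, as this short sketch shows. It assumes the usual textbook definitions: R2 is 1 minus the ratio of the residual sum-of-squares to the total sum-of-squares, and the adjusted R2 rescales that ratio by (N-1)/(N-K), where K counts every fitted parameter including the intercept. The function name and the example numbers are hypothetical.

```python
import math

def goodness_of_fit(y, y_pred, n_params):
    """Return (R2, multiple R, adjusted R2) for observed and predicted Y values."""
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))  # residual SS
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)               # total SS
    r2 = 1 - ss_res / ss_tot
    multiple_r = math.sqrt(r2)  # valid when r2 >= 0, as in a least-squares fit
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_params)
    return r2, multiple_r, adj_r2

# Illustrative values only, not the output of a real fit.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.9, 4.9]
r2, multiple_r, adj_r2 = goodness_of_fit(y, y_pred, n_params=2)
print(f"R2 = {r2:.4f}, multiple R = {multiple_r:.4f}, adjusted R2 = {adj_r2:.4f}")
```

With `n_params=1` the factor (N-1)/(N-K) equals 1, so the adjusted value matches R2, in line with the exception noted above.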

The sum-of-squares (or weighted sum-of-squares) is the value that Prism minimizes when it fits the model. Reporting this value is useful only if you want to compare Prism's results to those of another program, or you want to do additional calculations by hand.

Sy.x and RMSE are alternative ways to quantify the standard deviation of the residuals. We recommend the Sy.x, which is also called Se. Sy.x divides the sum-of-squares by N-K, where N is the number of rows analyzed and K is the number of parameters fit. RMSE uses N-1 in the denominator, instead of N-K.
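If you want to check these values by hand, the calculation can be sketched as follows. The denominators follow the description above (N-K for Sy.x, N-1 for RMSE). Some other programs define RMSE with N in the denominator, so confirm the definition before comparing results across software. The function name and the residuals are hypothetical.

```python
import math

def residual_sd(residuals, n_params):
    """Return (sum-of-squares, Sy.x, RMSE) for a list of residuals."""
    n = len(residuals)
    ss = sum(r * r for r in residuals)    # unweighted sum-of-squares
    sy_x = math.sqrt(ss / (n - n_params)) # denominator N - K
    rmse = math.sqrt(ss / (n - 1))        # denominator N - 1, as described above
    return ss, sy_x, rmse

# Hypothetical residuals from a fit of K = 3 parameters to N = 8 rows.
residuals = [0.5, -0.5, 1.0, -1.0, 0.0, 0.5, -0.5, 0.0]
ss, sy_x, rmse = residual_sd(residuals, n_params=3)
print(f"SS = {ss:.2f}, Sy.x = {sy_x:.4f}, RMSE = {rmse:.4f}")
```

Because N-K is never larger than N-1, Sy.x is never smaller than the RMSE computed this way, and the gap widens as more parameters are fit.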

The AICc is useful only if you separately fit the same data to three or more models. You can then use the AICc to choose between them. But note that it only makes sense to compare AICc between fits when the only difference is the model. If the data or weighting are not identical between fits, then any comparison of AICc values would be meaningless.
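The comparison itself is simple: the model with the lowest AICc is preferred. The sketch below assumes a commonly used least-squares form of the AICc, N·ln(SS/N) + 2K + 2K(K+1)/(N-K-1), where K is the number of fitted parameters plus one. Whether Prism uses exactly this expression is not stated on this page, and the model names and sum-of-squares values are invented for illustration.

```python
import math

def aicc_least_squares(ss, n, n_params):
    """Corrected Akaike criterion for a least-squares fit (assumed formula)."""
    k = n_params + 1  # count the residual variance as one more estimated quantity
    return n * math.log(ss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)

# Three hypothetical models fit to the same 40 rows with the same weighting.
n = 40
fits = {
    "model A": (42.0, 3),  # (sum-of-squares, number of fitted parameters)
    "model B": (30.0, 4),
    "model C": (29.5, 6),
}
scores = {name: aicc_least_squares(ss, n, p) for name, (ss, p) in fits.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AICc = {score:.2f}")
print("Preferred:", best)
```

In this made-up example, model C has the smallest sum-of-squares but is penalized for its extra parameters, so it does not have the lowest AICc.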

### If you chose Poisson regression

If you chose Poisson regression, Prism offers three ways to quantify the goodness of fit: the pseudo-R², the dispersion index and the model deviance. The pseudo-R² can be interpreted pretty much like an ordinary R². The other two values will be of interest only to those who have studied Poisson regression in depth.
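For readers who want to see what these three quantities look like, here is a minimal sketch using common textbook definitions: the Poisson deviance, a deviance-based pseudo-R2 (1 minus the model deviance divided by the deviance of an intercept-only model), and a dispersion index computed as the Pearson chi-square divided by N-K. These definitions are assumptions; several variants exist and Prism's exact formulas may differ. The counts and predicted means are invented.

```python
import math

def poisson_deviance(y, mu):
    """Deviance of observed counts y against predicted means mu (all mu > 0)."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0  # 0*log(0) taken as 0
        total += log_term - (yi - mi)
    return 2 * total

def poisson_diagnostics(y, mu, n_params):
    """Return (pseudo-R2, dispersion index, model deviance)."""
    n = len(y)
    mean_y = sum(y) / n
    dev_model = poisson_deviance(y, mu)
    dev_null = poisson_deviance(y, [mean_y] * n)  # intercept-only model
    pseudo_r2 = 1 - dev_model / dev_null
    pearson = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    dispersion = pearson / (n - n_params)
    return pseudo_r2, dispersion, dev_model

# Hypothetical counts and the means predicted by a two-parameter model.
y = [0, 1, 3, 4, 8, 12]
mu = [0.5, 1.2, 2.5, 4.5, 7.8, 11.5]
pseudo_r2, dispersion, deviance = poisson_diagnostics(y, mu, n_params=2)
print(f"pseudo-R2 = {pseudo_r2:.3f}, dispersion = {dispersion:.3f}, "
      f"deviance = {deviance:.3f}")
```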

## Are the residuals Gaussian?

Least-squares regression assumes that the distribution of residuals follows a Gaussian distribution. Prism can test this assumption by running a normality test on the residuals. Prism offers four normality tests. We recommend the D'Agostino-Pearson test.
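Outside Prism, a comparable check can be sketched with SciPy, whose `scipy.stats.normaltest` function is based on the D'Agostino and Pearson omnibus test. The residuals below are simulated for illustration, so the variable names and sample sizes are assumptions. With real data you would pass the residuals from your own fit.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated residuals: one Gaussian set and one clearly skewed set.
gaussian_residuals = rng.normal(loc=0.0, scale=1.5, size=200)
skewed_residuals = rng.exponential(scale=1.0, size=200) - 1.0

# Each call returns a test statistic and a P value.
stat_gauss, p_gauss = stats.normaltest(gaussian_residuals)
stat_skew, p_skew = stats.normaltest(skewed_residuals)

print(f"Gaussian residuals: K2 = {stat_gauss:.2f}, P = {p_gauss:.3f}")
print(f"Skewed residuals:   K2 = {stat_skew:.2f}, P = {p_skew:.3g}")
```

A small P value is evidence that the residuals were not sampled from a Gaussian distribution. A large P value does not prove that they were, only that the test found no evidence against it.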