KNOWLEDGEBASE - ARTICLE #458

How is s_y.x used to interpret regression analysis?

Linear and nonlinear regression results of Prism (and many other programs) include the value s_y.x. Some other programs call this quantity s_e. It is not a particularly useful part of regression output, and if you have been ignoring it, you can keep doing so without feeling guilty.

This value quantifies the scatter of the data points around the best-fit line or curve and is computed using this equation:

$$ s_{y.x} = \sqrt{\frac{SS}{df}} $$

SS is the sum of the squared vertical distances of the points from the line or curve, and df is the degrees of freedom of the fit, calculated as the number of data points (N) minus the number of parameters fit. For linear regression, which fits two parameters (slope and intercept), df = N-2.
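
As a minimal sketch of this calculation for a straight-line fit, the Python below computes the square root of SS/df with df = N-2. The function name s_yx and the ordinary least-squares fit written out by hand are illustrative assumptions, not Prism's own code.

```python
import math

def s_yx(x, y):
    """Standard deviation of the residuals for a least-squares straight line.

    Hypothetical helper for illustration: fits y = intercept + slope * x
    by ordinary least squares, then returns sqrt(SS / df) with df = N - 2.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Ordinary least-squares slope and intercept.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # SS: sum of squared vertical distances of the points from the line.
    ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    # df: number of points minus the two fitted parameters.
    df = n - 2
    return math.sqrt(ss / df)
```

For a nonlinear fit the same formula applies, with the residuals taken from the fitted curve and df reduced by the number of parameters in that model.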

Since s_y.x is the standard deviation of the vertical distances of the data points from the line, it is expressed in the same units used for the Y values. It is inversely related to goodness of fit: the smaller the value, the closer the points lie to the line or curve.

You can think of s_y.x very roughly as the average distance of the data from the best-fit line or curve. It is easier to interpret the value of r², which is computed from s_y.x and the standard deviation of all the Y values (without regard for the model being fit).
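
The relationship between r² and s_y.x can be sketched as follows, assuming the usual definition r² = 1 - SS/SS_total, where SS_total is the sum of squared deviations of the Y values from their mean. The function name and its arguments are hypothetical, and sd_y is assumed to be the sample standard deviation (computed with N-1 in the denominator).

```python
def r_squared_from_s_yx(s_yx_value, sd_y, n, n_params=2):
    """Recover r^2 from s_y.x and the standard deviation of all Y values.

    Hypothetical helper for illustration. Assumes sd_y is the sample SD
    (N - 1 denominator) and that n_params parameters were fit
    (2 for a straight line).
    """
    # Undo the square root and the division by df to get SS back.
    ss_residual = s_yx_value ** 2 * (n - n_params)

    # Total sum of squares around the mean of Y, without regard to the model.
    ss_total = sd_y ** 2 * (n - 1)

    return 1.0 - ss_residual / ss_total
```

Because r² is a unitless fraction, it can be compared across data sets measured in different units, which s_y.x cannot.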
