curvefit.com. Guide to nonlinear regression.




curvefit.com was created by GraphPad Software, Inc. Send comments or questions to the author of these pages, Dr. Harvey Motulsky, president of GraphPad Software.

In April 2003, GraphPad released Prism 4 and published Fitting Models to Biological Data using Linear and Nonlinear Regression. This book includes all the information that comprises curvefit.com, and much more. You can read this book as a pdf file.

Does the curve systematically deviate from the data?

If you've picked an appropriate model, and your data follow the assumptions of nonlinear regression, the data will be randomly distributed around the best-fit curve. You can assess this in three ways:

   The distribution of points around the curve should be Gaussian.
   The average distance of points from the curve should be the same for all parts of the curve (unless you used weighting).
   Points should not be clustered. Whether each point is above or below the curve should be random.

Residuals and runs help you evaluate whether the curve deviates systematically from your data.  

Residuals from nonlinear regression

A residual is the distance of a point from the curve. A residual is positive when the point is above the curve, and is negative when the point is below the curve. The residual table has the same X values as the original data, but the Y values are the vertical distances of the point from the curve.
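Outside Prism, the same residual table can be sketched in a few lines of Python. The model function, its parameter values, and the data points below are all made up for illustration; only the definition (observed Y minus the Y value of the curve at the same X) comes from the text above.

```python
import math

def model(x, span=100.0, k=0.1, plateau=5.0):
    """Hypothetical one-phase exponential decay; the parameters are illustrative."""
    return span * math.exp(-k * x) + plateau

def residual_table(xs, ys, curve):
    """Return (x, residual) pairs, where residual = observed Y - curve Y.

    A positive residual means the point lies above the curve,
    a negative residual means it lies below.
    """
    return [(x, y - curve(x)) for x, y in zip(xs, ys)]

# Made-up data, used only to show the shape of the output.
xs = [0, 5, 10, 20]
ys = [106.0, 64.0, 43.0, 18.0]
table = residual_table(xs, ys, model)

for x, r in table:
    print(f"X = {x:>4}  residual = {r:+.3f}")
```

The table keeps the original X values, so plotting the pairs directly gives a residual graph like the one described below.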

If you selected the residuals output option, Prism creates a graph of the residuals. An example is shown below. If you look carefully at the curve on the left, you'll see that the data points are not randomly distributed above and below the curve. There are clusters of points at early and late times that are below the curve, and a cluster of points at middle time points that are above the curve. This is much easier to see on the graph of the residuals in the inset. The data are not randomly scattered above and below the X-axis.

Runs test from nonlinear regression

The runs test determines whether the curve deviates systematically from your data. A run is a series of consecutive points that are either all above or all below the regression curve. Another way of saying this is that a run is a consecutive series of points whose residuals are either all positive or all negative.

If the data points are randomly distributed above and below the regression curve, it is possible to calculate the expected number of runs. If there are Na points above the curve and Nb points below the curve, the number of runs you expect to see equals [(2NaNb)/(Na+Nb)]+1. If you observe fewer runs than expected, it may be a coincidence or it may mean that you picked an inappropriate regression model and the curve systematically deviates from your data. The P value from the runs test answers this question: If the data really follow the model you selected, what is the chance that you would obtain as few (or fewer) runs as observed in this experiment?
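As a rough illustration of the counting described above, the sketch below counts runs in a list of residuals and evaluates the expected-runs formula. The function names are hypothetical, and dropping residuals that are exactly zero is an assumption of this sketch, not something the text specifies.

```python
def count_runs(residuals):
    """Count runs of consecutive residuals that share the same sign.

    Assumption: points lying exactly on the curve (residual == 0) are ignored.
    """
    signs = [r > 0 for r in residuals if r != 0]
    if not signs:
        return 0
    # A new run starts every time the sign changes between neighbors.
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def expected_runs(n_above, n_below):
    """Expected number of runs if points scatter randomly: 2*Na*Nb/(Na+Nb) + 1."""
    return 2 * n_above * n_below / (n_above + n_below) + 1

# Made-up residuals, ordered by X: above, above, below, below, above -> 3 runs.
example = [1.2, 0.4, -0.7, -0.1, 0.9]
print(count_runs(example))
print(expected_runs(3, 2))
```

With equal numbers of points above and below the curve, for instance 20 and 20, the formula gives 2(20)(20)/40 + 1 = 21 expected runs.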

The P values are always one-tailed, asking about the probability of observing as few runs as observed (or fewer). If you observe more runs than expected, the P value will be higher than 0.50.
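One way to obtain such a one-tailed P value is the exact distribution of the number of runs when Na "above" points and Nb "below" points are arranged in random order. The sketch below uses that standard combinatorial result; it is not necessarily the calculation Prism performs, and the function names are hypothetical. It assumes at least one point on each side of the curve.

```python
from math import comb

def prob_exact_runs(r, na, nb):
    """P(exactly r runs) for a random ordering of na above and nb below points.

    Assumes na >= 1 and nb >= 1.
    """
    if r < 2:
        return 0.0
    total = comb(na + nb, na)          # number of equally likely orderings
    k, odd = divmod(r, 2)
    if not odd:                        # r = 2k: k runs of each sign
        ways = 2 * comb(na - 1, k - 1) * comb(nb - 1, k - 1)
    else:                              # r = 2k + 1: k+1 runs of one sign, k of the other
        ways = (comb(na - 1, k) * comb(nb - 1, k - 1)
                + comb(na - 1, k - 1) * comb(nb - 1, k))
    return ways / total

def runs_test_p(observed_runs, na, nb):
    """One-tailed P value: chance of observing this few runs or fewer."""
    return sum(prob_exact_runs(r, na, nb) for r in range(2, observed_runs + 1))

# Tiny hand-checkable case: 2 above, 2 below. The six orderings give
# 2, 2, 3, 3, 4 and 4 runs, so P(runs <= 3) is 4/6.
print(runs_test_p(3, 2, 2))
```

Because the sum runs over the lower tail only, a data set with more runs than expected yields a P value above 0.50, as the paragraph above states.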

If the runs test reports a low P value, conclude that the data don't really follow the equation you have selected.

In the example above, you expect 21 runs. There are 13 runs, and the P value for the runs test is 0.0077. If the data were randomly scattered above and below the curve, there is less than a 1% chance of observing so few runs. The data systematically deviate from the curve. Most likely, the data were fit to the wrong equation.

Testing whether the residuals are Gaussian

Nonlinear regression (as well as linear regression) assumes that the distribution of residuals follows a Gaussian distribution. You can test this assumption with Prism.

1. From the nonlinear regression parameters dialog, check the option box to show a table of residuals. If you have already fit your data, you can go back to this dialog by pressing Change Parameters from your results.

2. Go to the nonlinear regression results. Drop the list of results views (middle of third row of toolbar) and choose Residuals.

3. Press Analyze. Choose Column statistics from the list of statistical analyses.

4. On the column statistics parameters dialog, check the option box to test whether the distribution is Gaussian. Look at the results of the normality test on the column statistics results page. If the P value is small, the data fail the normality test.
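If you have exported the residuals and want a quick check outside Prism, a normality test can be sketched with the Python standard library alone. The example below uses the Jarque-Bera statistic, which is a different test from the one Prism runs in its column statistics analysis; it is swapped in here only because it is short to write down. Its P value relies on a large-sample chi-square approximation (2 degrees of freedom), so treat it as rough for small data sets. The function name and the sample values are hypothetical.

```python
import math

def jarque_bera(values):
    """Return (JB statistic, approximate P value) for a list of residuals.

    JB = n/6 * (skewness^2 + (kurtosis - 3)^2 / 4).
    The P value uses the asymptotic chi-square(2) distribution, whose
    upper tail probability is exp(-JB / 2).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    return jb, math.exp(-jb / 2)

# Made-up residuals for illustration only.
jb, p = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
print(f"JB = {jb:.4f}, approximate P = {p:.4f}")
```

As with the Prism result, a small P value suggests the residuals are not Gaussian.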



All contents copyright © 1999 by GraphPad Software, Inc. All rights reserved.