The GraphPad Guide to
Nonlinear Regression
Dr. Harvey Motulsky, President, GraphPad Software
Copyright © 1995 – 2001 by GraphPad Software, Inc. All rights reserved.


Part 1. Introduction to nonlinear regression

Nonlinear regression is a powerful tool for analyzing scientific data, especially in pharmacology and physiology. Because it is a topic ignored by most statistics books, we explain nonlinear regression in more detail than the other analyses performed by Prism. This chapter explains the principles of nonlinear regression, the next explains how easily you can perform nonlinear regression with Prism, and the following chapter helps you interpret the results.

The goal of nonlinear regression is to fit a model to your data. The program finds the best-fit values of the variables in the model (perhaps rate constants, affinities, receptor number, etc.) which you can interpret scientifically. In most cases, the primary goal is to obtain those values and a secondary goal is to draw a graph of the fit curve.

In some situations, your only goal is to draw a curve. You don't care about models or equations, and don't want to obtain best-fit values. You just want a smooth curve through your points, either for artistic reasons or to use as a standard curve. You may still use nonlinear regression in these situations, or you may use these alternatives:

  • Polynomial regression.
  • Cubic spline or LOWESS curve.
  • A program that fits your data to thousands of equations and picks the best.

This chapter (and the next two) assumes that your goal is primarily to obtain the best-fit values of the variables -- to fit a model to your data.

A note on terminology. A model is a formal presentation of a chemical or physiological idea. To be useful for nonlinear regression, the model must be expressed as an equation that defines Y, the outcome you measure, as a function of X and one or more variables that you want to fit. We use the term variable to refer to the terms in the equation you want to fit. In the context of nonlinear regression, the term variable does not refer to X and Y. Some programs and books use the word parameters rather than variables.


Why you should use nonlinear regression

Linear regression of transformed data is less accurate

Before the age of microcomputers, nonlinear regression was not readily available to most scientists. Instead, scientists transformed their data to make a linear graph, and then analyzed the transformed data with linear regression. Examples include Lineweaver-Burk plots of enzyme kinetic data, Scatchard plots of binding data, and logarithmic plots of kinetic data.

These methods are outdated, and should not be used to analyze data. The problem is that the linear transformation distorts the experimental error. Linear regression assumes that the scatter of points around the line follows a Gaussian distribution and that the standard deviation is the same at every value of X. These assumptions are usually not true with the transformed data. A second problem is that some transformations alter the relationship between X and Y. For example, in a Scatchard plot the value of X (bound) is used to calculate Y (bound/free), and this violates the assumptions of linear regression.

Since the assumptions of linear regression are violated, the results of linear regression are incorrect. The values derived from the slope and intercept of the regression line are not the most accurate determinations of the variables in the model. Considering all the time and effort you put into collecting data, you want to use the best possible analysis technique. Nonlinear regression produces the most accurate results.

This figure shows the problem of transforming data. The left panel shows data that follows a rectangular hyperbola (binding isotherm). The right panel is a Scatchard plot of the same data. The solid curve on the left was determined by nonlinear regression. The solid line on the right shows how that same curve would look after a Scatchard transformation. The dotted line shows the linear regression fit of the transformed data. The transformation amplified and distorted the scatter, and thus the linear regression fit does not yield the most accurate values for Bmax and Kd.
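A small numerical sketch can make the distortion concrete. The Python example below is only an illustration: the values of Bmax and Kd, the free concentrations, and the size of the error are all hypothetical. It adds the same error to the bound value at every point and shows how much the Scatchard Y value (bound/free) shifts as a result.

```python
# Sketch: how a Scatchard transform distorts a constant experimental error.
# All numbers are hypothetical and chosen only for illustration.

def one_site_binding(free, bmax, kd):
    """Rectangular hyperbola (binding isotherm): bound as a function of free."""
    return bmax * free / (kd + free)

def scatchard(free, bound):
    """Return the Scatchard coordinates (X = bound, Y = bound/free)."""
    return bound, bound / free

bmax, kd = 100.0, 5.0                               # assumed "true" values
free_values = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
error = 3.0                                         # same error in bound at every point

shifts = []
for free in free_values:
    bound = one_site_binding(free, bmax, kd)
    _, y_exact = scatchard(free, bound)
    _, y_with_error = scatchard(free, bound + error)
    shifts.append(y_with_error - y_exact)           # equals error / free

for free, shift in zip(free_values, shifts):
    print(f"free = {free:5.1f}   shift in bound/free = {shift:.3f}")
```

The same error in bound moves the transformed Y value far more at low free concentrations than at high ones, so the scatter is no longer the same at every point. The error also moves the point along the X axis, because bound is the X value of the Scatchard plot.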

Transformations can be very useful when used appropriately. When analyzing data, follow these rules:

  • You should transform your data when the transformation makes the variability more consistent and more Gaussian.
  • You should not transform data when the transformation makes the variability less consistent and less Gaussian.
  • You should not perform transforms (such as the Scatchard transform) that destroy the relationship between X and Y.
  • You should not transform the data merely to make it linear. Since nonlinear regression is easy, there is no reason to force your data into a linear form.

Although it is usually inappropriate to analyze transformed data, it is often helpful to display data after a linear transform. Many people find it easier to visually interpret transformed data. This makes sense because the human eye and brain evolved to detect edges (lines) - not to detect rectangular hyperbolas or exponential decay curves. Even if you analyze your data with nonlinear regression, it may make sense to display transformed data.

Don't relegate scientific decisions to a computer program

The goal of nonlinear regression is to fit a model to your data. The program finds the best-fit values of the variables in the model (perhaps rate constants, affinities, receptor number, etc.) which you can interpret scientifically. Choosing a model is a scientific decision. You should base your choice on your understanding of chemistry or physiology (or genetics, etc.). The choice should not be based solely on the shape of the graph.

Some programs (not available from GraphPad) automatically fit data to hundreds or thousands of equations and then present you with the equation(s) that fit the data best. Using such a program is appealing because it frees you from the need to choose an equation. The problem is that the program has no understanding of the scientific context of your experiment. The equations that fit the data best are unlikely to correspond to scientifically meaningful models. You will not be able to interpret the best-fit values of the variables, and the results are unlikely to be useful for data analysis.

This kind of approach is very useful in three situations:

  • Your only goal is to plot an attractive curve.
  • You wish to create a standard curve for interpolating unknown values.
  • You need an equation to use within a computer simulation.

    In all three situations, it doesn't matter whether the equation corresponds to a biological, chemical or physical model. What matters is that the equation accurately predicts Y from X within the range of your data.

    Don't use this approach when the goal of curve fitting is to fit the data to a model based on chemical, physical, or biological principles. Don't use a computer program to avoid making a scientific decision.

    The results of polynomial regression are often impossible to interpret scientifically

    Beware of the term "curve fitting". The term is often used to refer not to nonlinear regression, but rather to polynomial regression. This method fits data to a polynomial equation: Y = A + BX + CX^2 + DX^3 + ... Programmers prefer polynomial regression, because it is so much easier to program. That's why it is built into so many spreadsheet and graphics programs. But few biological or chemical models are described by polynomial equations, so polynomial regression is of limited usefulness to scientists.

    Cubic spline is not a data analysis method

    Cubic spline curves are smooth curves that go through every data point. In some cases, a cubic spline curve can look attractive on a graph and work well as a standard curve for interpolation. The curve does not correspond to any equation (or rather the equation differs for every pair of points) so cubic spline is not useful in data analysis.


    How nonlinear regression works

    Comparison of linear and nonlinear regression

    A line is described by a simple equation that calculates Y from X, slope and intercept. The purpose of linear regression is to find values for the slope and intercept that define the line that comes closest to the data. More precisely, it finds the line that minimizes the sum of the squares of the vertical distances of the points from the line.

    The goal of minimizing the sum-of-squares in linear regression can be achieved quite simply. A bit of algebra (shown in many statistics books) derives equations that define the slope and intercept. Put the data in, and the answers come out. There is no chance for ambiguity.
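As an illustration of "put the data in, and the answers come out", here is a minimal Python sketch of the standard least-squares formulas for the slope and intercept. The function name and the example data are hypothetical.

```python
# Sketch: closed-form least-squares line. No iteration is needed.

def linear_regression(xs, ys):
    """Return (slope, intercept) of the line that minimizes the sum of the
    squares of the vertical distances of the points from the line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data that lie exactly on the line Y = 2X + 1.
slope, intercept = linear_regression([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(slope, intercept)
```

The same data always give the same slope and intercept, because the formulas are evaluated once rather than refined step by step.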

    Nonlinear regression is more general. It can fit data to any equation that defines Y as a function of X and one or more variables. It finds the values of those variables that generate the curve that comes closest to the data. More precisely, the goal is to minimize the sum of the squares of the vertical distances of the points from the curve.

    Except for a few special cases, it is not possible to directly solve the equation to find the values of the variables that minimize the sum-of-squares. Instead nonlinear regression requires an iterative approach.

    Iterations in nonlinear regression

    Here are the steps that every nonlinear regression program follows:

    1. Start with an initial estimated value for each variable in the equation.
    2. Generate the curve defined by the initial values. Calculate the sum-of-squares (the sum of the squares of the vertical distances of the points from the curve).
    3. Adjust the variables to make the curve come closer to the data points. There are several algorithms for adjusting the variables. The most commonly used method was derived by Levenberg and Marquardt (often called simply the Marquardt method).
    4. Adjust the variables again so that the curve comes even closer to the points.
    5. Keep adjusting the variables until the adjustments make virtually no difference in the sum-of-squares.
    6. Report the best-fit results. The precise values you obtain will depend in part on the initial values chosen in step 1 and the stopping criteria of step 5. This means that repeat analyses of the same data will not always give exactly the same results.
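The steps above can be sketched in a few dozen lines of Python. This is a simplified illustration in the spirit of the Marquardt method, not the code used by Prism or any other program. It assumes a one-site binding model, Y = Bmax*X/(Kd + X), and the function names, the damping schedule, the stopping thresholds, and the noise-free example data are all hypothetical choices made for this sketch.

```python
# Sketch: iterative least-squares fit of Y = Bmax*X/(Kd + X).
# A simplified Levenberg-Marquardt-style loop; all settings are assumptions.

def one_site_binding(x, bmax, kd):
    return bmax * x / (kd + x)

def sum_of_squares(xs, ys, bmax, kd):
    return sum((y - one_site_binding(x, bmax, kd)) ** 2 for x, y in zip(xs, ys))

def fit_one_site(xs, ys, bmax, kd, max_iterations=200):
    damping = 1e-3
    ss = sum_of_squares(xs, ys, bmax, kd)             # step 2
    for _ in range(max_iterations):
        # Accumulate the 2x2 normal equations from the partial derivatives.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            residual = y - one_site_binding(x, bmax, kd)
            d_bmax = x / (kd + x)                     # dY/dBmax
            d_kd = -bmax * x / (kd + x) ** 2          # dY/dKd
            a11 += d_bmax * d_bmax
            a12 += d_bmax * d_kd
            a22 += d_kd * d_kd
            g1 += d_bmax * residual
            g2 += d_kd * residual
        # Damp the diagonal, then solve the 2x2 system for the adjustments.
        m11 = a11 * (1.0 + damping)
        m22 = a22 * (1.0 + damping)
        det = m11 * m22 - a12 * a12
        step_bmax = (m22 * g1 - a12 * g2) / det
        step_kd = (m11 * g2 - a12 * g1) / det
        new_ss = sum_of_squares(xs, ys, bmax + step_bmax, kd + step_kd)
        if new_ss < ss:                               # steps 3 and 4: accept
            converged = (ss - new_ss) <= 1e-9 * ss    # step 5: tiny improvement
            bmax, kd, ss = bmax + step_bmax, kd + step_kd, new_ss
            damping /= 10.0
            if converged:
                break
        else:                                         # reject, be more cautious
            damping *= 10.0
            if damping > 1e10:
                break
    return bmax, kd, ss                               # step 6

# Noise-free data simulated from Bmax = 100 and Kd = 5 (hypothetical values).
xs = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
ys = [one_site_binding(x, 100.0, 5.0) for x in xs]
best_bmax, best_kd, best_ss = fit_one_site(xs, ys, bmax=80.0, kd=8.0)  # step 1
print(best_bmax, best_kd, best_ss)
```

Because the loop stops when the improvement becomes tiny rather than when it reaches an exact answer, different initial values or stopping thresholds can change the last few digits of the result.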

    Decisions you need to make when fitting curves with nonlinear regression

    When you use a program for nonlinear regression, you must make the following decisions.

    Choose a model

    To use nonlinear regression, you must first define a mathematical model based on theory. The first step is to choose a model. For example, many kinds of binding data are explained by the law of mass action. The next step is to express the model as an equation that defines Y as a function of X and one or more variables. Some programs (not Prism) also let you define the model as a differential equation that defines dY/dX as a function of one or more variables.

    Choosing a model is a scientific decision, not a statistical one. The model needs to make sense in scientific terms.

    You may also fit two different models to your data, and then use statistical methods (F test) to compare them (discussed in Part 2).

    Prepare data for nonlinear regression

    When preparing data for nonlinear regression, keep these points in mind:

  • It matters which variable is X and which is Y. X should be the variable you control or manipulate. Y is the variable you measure. Nonlinear regression finds the curve that lets you best predict Y from X.
  • Use reasonable units. In pure mathematics, it doesn't matter whether you express your results as 1 picomolar or 10^-12 molar, as 1 nanovolt or 10^-9 volts. When computers do the calculating, however, it can matter. Calculation problems such as round-off errors are far more likely when the values are very high or very low. We recommend that you scale your data to avoid values less than 10^-4 or greater than 10^4.
  • Don't smooth. You lose information when you smooth data, and this won't get you a better fit.
  • If you are fitting data to a sigmoidal dose-response or competitive binding curve, enter the X values as the logarithm of concentration, rather than the concentration itself.

    Estimate initial values

    Nonlinear regression is an iterative procedure. The program must start with estimated values for each variable that are in the right "ball park" -- say within a factor of five of the actual value. It then adjusts these initial values to improve the fit. It then adjusts the values again and again until the improvement is tiny.

    If you have "clean" data that clearly define a curve, then it usually doesn't matter if the initial values are fairly far from the correct values. You'll get the same answer no matter what initial values you use, unless the initial values are very far from correct.

    Initial values matter more when your data have a lot of scatter, don't span a large enough range of X values to define a full curve, or don't really fit the model. In these cases, you may get different answers depending on which initial values you use. (False minima are discussed in Part 2.)

    You'll find it easy to estimate initial values if you have looked at a graph of the data, and understand the model and what all the variables mean. Remember, you just need an estimate. It doesn't have to be very accurate. If you are having problems estimating an initial value:

  • Check that you have chosen a model that makes scientific sense.
  • Make sure you understand what each variable in the equation means.
  • Put away your data, and spend an hour or two generating curves using the model. Change the variables one at a time, and see how they influence the shape of the curve.
  • Prism automatically provides initial values if you choose a built-in equation. If you use a user-defined equation, you can define rules for obtaining initial values from the range of the X and Y values. Once you define these rules, Prism will automatically determine the initial values in the future.
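To show what a rule based on the range of the X and Y values might look like, here is a minimal Python sketch for a one-site binding curve, Y = Bmax*X/(Kd + X). The rule itself is an assumption made for illustration (Bmax starts at the largest Y, Kd at the X whose Y is closest to half of that). It is not a description of the rules Prism uses.

```python
# Sketch: rough initial values for Y = Bmax*X/(Kd + X) from the data range.
# The rule is hypothetical; it only needs to land in the right ball park.

def initial_values_one_site(xs, ys):
    bmax_guess = max(ys)                      # the top plateau is near the largest Y
    half_max = bmax_guess / 2.0
    # Kd is the X that gives half-maximal Y, so take the X of the nearest point.
    kd_guess = min(zip(xs, ys), key=lambda point: abs(point[1] - half_max))[0]
    return bmax_guess, kd_guess

# Hypothetical data.
xs = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
ys = [17.0, 29.0, 50.0, 67.0, 80.0, 91.0]
bmax_guess, kd_guess = initial_values_one_site(xs, ys)
print(bmax_guess, kd_guess)
```

If the data do not reach a plateau, the largest Y underestimates Bmax, but an estimate of this kind is usually close enough to start the iterations.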

    Constants

    You don't have to fit every variable in the equation. In many situations it makes sense to fix some of the variables to constant values. For example, you might want to define the bottom plateau of a dose-response curve or an exponential decay curve to equal zero.
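In code, fixing a variable amounts to removing it from the list of values the program is allowed to adjust. The Python sketch below assumes an exponential decay model with hypothetical names (span, k, plateau) and uses functools.partial to hold the bottom plateau at zero, leaving only span and k to be fit.

```python
# Sketch: fixing one variable of a model to a constant value.
import math
from functools import partial

def exponential_decay(x, span, k, plateau):
    """Y decays from (span + plateau) toward plateau with rate constant k."""
    return span * math.exp(-k * x) + plateau

# Hold the bottom plateau at zero; only span and k remain to be fit.
decay_to_zero = partial(exponential_decay, plateau=0.0)

print(decay_to_zero(0.0, 10.0, 0.5))   # value at X = 0 equals span
print(decay_to_zero(2.0, 10.0, 0.5))
```

A fitting routine would then adjust span and k only, which often makes the remaining values better defined by the data.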

    Weighting

    In general, the goal of nonlinear regression is to find the values of the variables in the model that make the curve come as close as possible to the points. Usually this is done by minimizing the sum of the squares of the vertical distances of the data points from the curve. This is appropriate when you expect that the scatter of points around the curve is Gaussian and unrelated to the Y values of the points. (Note to those who have studied advanced statistics: If those assumptions are true, minimizing the sum-of-squares is equivalent to finding the maximum likelihood estimate of the variables).

    With many experimental protocols, you don't expect the experimental scatter to be the same, on average, for all points. Instead, you expect the experimental scatter to be a constant percentage of the Y value. If this is the case, points with high Y values will have more scatter than points with low Y values. When the program minimizes the sum of squares, points with high Y values will have a larger influence while points with smaller Y values will be relatively ignored. You can get around this problem by minimizing the sum of the squares of the relative distances. This procedure is termed weighting the values by 1/Y^2. Because it prevents large points from being over-weighted, the term unweighting seems more intuitive.

    It is also possible to weight the data in other ways. The goal, always, is to end up with a measure of goodness-of-fit that weights all the data points equally.
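The difference between the two measures is easy to see with two points that each miss the curve by 10 percent. The Python sketch below uses hypothetical function names and made-up values. It compares the ordinary sum-of-squares with the sum of the squares of the relative distances (weighting by 1/Y^2).

```python
# Sketch: unweighted versus relative (1/Y^2) sum-of-squares.

def sum_of_squares(ys, predicted):
    return sum((y - p) ** 2 for y, p in zip(ys, predicted))

def relative_sum_of_squares(ys, predicted):
    # Each distance is divided by the measured Y before squaring.
    return sum(((y - p) / y) ** 2 for y, p in zip(ys, predicted))

# Hypothetical data: the curve misses both points by 10 percent of Y.
ys = [10.0, 1000.0]
predicted = [11.0, 1100.0]

absolute = sum_of_squares(ys, predicted)
relative = relative_sum_of_squares(ys, predicted)
print(absolute, relative)
```

In the unweighted sum the point with Y = 1000 contributes 10000 of the total 10001, so the fit would be driven almost entirely by that point. In the relative sum both points contribute equally.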

    Average replicates?

    If you collected replicate Y values at every value of X, there are two ways to analyze the data:

  • Treat each replicate as a separate point.
  • Average the replicate Y values, and treat the mean as a single point.

    Deciding which approach to use can be difficult.

    The advantage of the first approach is that you have more data points and thus more degrees of freedom. However, you should only use that approach when the experimental error of each replicate is no more closely related to the other replicates than to other data points. Here are two examples where you should analyze each replicate:

  • You are doing a radioligand binding experiment. All the data were obtained from one tissue preparation and each replicate was determined from a separate incubation (separate test tube). The sources of experimental error are the same for each tube. If one value happens to be a bit high, there is no reason to expect the other replicates to be high as well.
  • You are doing an electrophysiology study. You apply a voltage across a cell membrane and measure conductance. Each data point was obtained from a separate cell. The possible sources of experimental error are independent for each cell. If one cell happens to have a high conductance, there is no reason to expect the replicate cells (those that you apply the same voltage to) to also have high conductance.

    You should not treat each replicate as a separate point when the experimental errors of the replicates are related. You should average the replicates instead, and analyze the averages. Here are two examples where you should average the replicates:

  • The experiment was only performed with a single replicate at each value of X, and you measure radioactivity as Y. Each tube is counted three times, and the three counts are treated as replicates. Any experimental error while conducting the experiment would appear in all the replicates. The replicates are not independent.
  • The experiment is a dose-response curve. At each dose, you use a different animal but measure the response three times. If an animal happens to respond more than the others, that will affect all three measurements. The replicates are not independent.
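The two ways of handling replicates differ only in how the data are arranged before fitting. The Python sketch below assumes a hypothetical layout in which each row holds one X value and its list of replicate Y values.

```python
# Sketch: two ways to prepare replicate data for curve fitting.
from statistics import mean

def expand_replicates(rows):
    """Treat each replicate as a separate (X, Y) point."""
    return [(x, y) for x, ys in rows for y in ys]

def average_replicates(rows):
    """Average the replicate Y values and keep one (X, mean Y) point per X."""
    return [(x, mean(ys)) for x, ys in rows]

# Hypothetical data: two X values, three replicates each.
rows = [(1.0, [10.0, 12.0, 14.0]), (2.0, [20.0, 22.0, 24.0])]

separate_points = expand_replicates(rows)    # six points
averaged_points = average_replicates(rows)   # two points
print(separate_points)
print(averaged_points)
```

Fitting the expanded points gives more degrees of freedom, which is only justified when the replicates are independent. Fitting the averaged points is the safer choice when they are not.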