Simulations and script to assess confidence intervals

Meaning of 95% confidence

When you fit a curve with nonlinear regression, among the most important results are the 95% confidence intervals of the parameters. These intervals are computed from the standard errors, which rest on some mathematical simplifications and are therefore called "asymptotic" or "approximate" standard errors. They are calculated as if the equation were linear, but are applied to nonlinear equations. This simplification means that the intervals can be too optimistic: they may include the true parameter value less often than 95% of the time.

How can you know whether the intervals really do have 95% confidence? There is no general way to answer this. But for any particular situation, you can get an answer using simulations.

Combining simulations and scripting: Monte Carlo analyses

1. Create a simulation analysis, to generate data similar to the data you plan to collect in your experiment, with reasonable choices for the range of X values, spacing of X values, number of replicates, and amount of scatter.
2. Fit the simulated data with nonlinear regression, choosing the appropriate model. In the Diagnostics tab, check the option to report the 95% confidence intervals of the parameters, and choose "separate lower and upper limits" (rather than a range).
3. Make sure the simulation is the first results sheet, and the nonlinear regression is the second. Create a column data table, and move it up to the top of the list of data tables (if there are others).
4. Run the script listed below. The first line sets things up so the Wtable commands write to the first data table in the project. The script loops 1000 times. On each pass it regenerates the random scatter in the simulated data, recalculates the nonlinear regression results (by going to that results page), and uses the Wtable commands to write out the two confidence limits (rows 14 and 19 of column A on the results page). You will have to change the row numbers to match the confidence limits you want to record, and change the titles ("lowerKM" and "upperKM") accordingly.

Table Prism 1 Clear
ForEach 1000
 GoTo R 1
 Regenerate
 GoTo R 2
 Wtable "lowerKM", 14,1
 Wtable "upperKM", 19,1
Next

5. Go to the first data table. Click Analyze, and choose Transform. Enter this user-defined Y transform, and define TrueValue in the dialog (the value used when simulating the data in step 1). The first line says that when transforming the Y values in column A, the result is 0 if Y is less than TrueValue, and 1 otherwise. The second line applies the mirror-image logic to column B: the result is 0 if Y is greater than TrueValue, and 1 otherwise.

<A>Y=IF(Y<TrueValue, 0,1)

<B>Y=IF(Y>TrueValue,0,1)

 

6. From the results table, click Analyze and choose Column Statistics. Accept the default choices. On the results page, note the bottom row, which is the sum of the columns.
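
For readers who want to check the logic outside Prism, the six steps above can be sketched in Python. This is only an illustration under stated assumptions: the Michaelis-Menten model, the X values, the true parameter values, the amount of scatter, and every function name are hypothetical choices, and the fit uses a plain Gauss-Newton iteration rather than Prism's own fitting algorithm. The critical t value is hard-coded for 10 degrees of freedom (12 points minus 2 parameters).

```python
import math
import random

# Illustrative assumptions, not values taken from Prism or from this topic.
X_VALUES = [1, 2, 4, 6, 8, 10, 15, 20, 30, 40, 60, 80]
TRUE_VMAX, TRUE_KM, SCATTER_SD = 100.0, 5.0, 5.0
T_CRIT = 2.228  # two-tailed 95% critical t for 12 points - 2 parameters = 10 df


def sums(vmax, km, ys):
    """Entries of J'J and J'r, plus the sum of squared residuals, at (vmax, km)."""
    a = b = c = g1 = g2 = ss = 0.0
    for x, y in zip(X_VALUES, ys):
        d_vmax = x / (km + x)               # derivative of Y with respect to Vmax
        d_km = -vmax * x / (km + x) ** 2    # derivative of Y with respect to Km
        resid = y - vmax * d_vmax
        a += d_vmax * d_vmax
        b += d_vmax * d_km
        c += d_km * d_km
        g1 += d_vmax * resid
        g2 += d_km * resid
        ss += resid * resid
    return a, b, c, g1, g2, ss


def km_interval(ys, iterations=30):
    """Fit Y = Vmax*X/(Km+X) and return the asymptotic 95% limits of Km."""
    vmax, km = TRUE_VMAX, TRUE_KM  # start at the true values to keep the sketch short
    for _ in range(iterations):
        a, b, c, g1, g2, _ss = sums(vmax, km, ys)
        det = a * c - b * b
        vmax += (c * g1 - b * g2) / det   # Gauss-Newton step for Vmax
        km += (a * g2 - b * g1) / det     # Gauss-Newton step for Km
    a, b, c, _g1, _g2, ss = sums(vmax, km, ys)
    variance = ss / (len(ys) - 2)
    se_km = math.sqrt(variance * a / (a * c - b * b))  # asymptotic standard error
    return km - T_CRIT * se_km, km + T_CRIT * se_km


def monte_carlo(n_sims=1000, seed=1):
    """Count intervals that start too high or end too low (columns A and B)."""
    rng = random.Random(seed)
    too_high = too_low = 0
    for _ in range(n_sims):
        ys = [TRUE_VMAX * x / (TRUE_KM + x) + rng.gauss(0.0, SCATTER_SD)
              for x in X_VALUES]
        lower, upper = km_interval(ys)
        too_high += lower > TRUE_KM   # column A would hold 1
        too_low += upper < TRUE_KM    # column B would hold 1
    return too_high, too_low


if __name__ == "__main__":
    high, low = monte_carlo()
    print("started too high:", high, " ended too low:", low)
```

The two counts play the role of the column sums reported in step 6.
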

The results table records whether each confidence interval includes the true value. Each row comes from one simulated data set. Column A is 0 when that confidence interval started below the true value, and otherwise is 1. Column B is 0 when the confidence interval ended above the true value, and otherwise is 1. The total number of 1s in the two columns is the number of confidence intervals that do not include the true value. If the 95% confidence intervals are correct and you ran 1000 simulations, you expect about 25 ones in column A (the confidence interval started too high) and about 25 ones in column B (the confidence interval ended too low), for a total of about 50 intervals that did not include the true value (5%, leaving 95% that did include it). If the fraction is far from 5%, you should distrust the confidence interval for that parameter.
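
To judge how far is "far from 5%", it can help to remember that the count of missed intervals is itself random. The Python sketch below is an illustration, not part of Prism: the function names and the four made-up intervals are hypothetical. It reproduces the 0/1 tally described above and computes the range of counts you would expect from chance alone, assuming each simulated interval independently misses with probability 5% (a binomial model).

```python
import math


def tally_misses(lower_limits, upper_limits, true_value):
    """Return (sum of column A, sum of column B) using the same 0/1 rules."""
    started_too_high = sum(0 if lower < true_value else 1 for lower in lower_limits)
    ended_too_low = sum(0 if upper > true_value else 1 for upper in upper_limits)
    return started_too_high, ended_too_low


def expected_miss_range(n_sims, miss_rate=0.05, n_sd=2.0):
    """Expected number of misses with a rough +/- n_sd binomial range."""
    expected = n_sims * miss_rate
    sd = math.sqrt(n_sims * miss_rate * (1.0 - miss_rate))
    return expected - n_sd * sd, expected, expected + n_sd * sd


# Made-up limits from four simulated data sets, with a true value of 5.0.
lowers = [3.9, 4.2, 5.1, 4.6]
uppers = [6.3, 6.8, 7.4, 4.9]
print(tally_misses(lowers, uppers, 5.0))  # (1, 1)

low, expected, high = expected_miss_range(1000)
print(round(low, 1), expected, round(high, 1))  # 36.2 50.0 63.8
```

Under this assumption, a total anywhere from roughly 36 to 64 misses out of 1000 is consistent with correct 95% intervals.
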

Of course, you can obtain more precise answers by using more simulated data sets. Change the value in the second line of the script from 1000 to some larger value.
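
As a rough guide to how many simulated data sets are enough, the standard error of an estimated coverage fraction shrinks with the square root of the number of simulations. The Python sketch below assumes independent simulations and a true coverage near 95%. The function names are hypothetical and nothing here comes from Prism.

```python
import math


def coverage_standard_error(n_sims, coverage=0.95):
    """Standard error of the observed coverage fraction after n_sims simulations."""
    return math.sqrt(coverage * (1.0 - coverage) / n_sims)


def simulations_needed(target_se, coverage=0.95):
    """Approximate number of simulations needed to reach a target standard error."""
    return math.ceil(coverage * (1.0 - coverage) / target_se ** 2)


for n in (1000, 10000, 100000):
    # Report the standard error in percentage points.
    print(n, round(100 * coverage_standard_error(n), 3))

# Simulations needed for a standard error of 0.2 percentage points.
print(simulations_needed(0.002))
```

With 1000 simulations the standard error is about 0.7 percentage points. Increasing the loop count tenfold reduces it only by a factor of about 3.2, so very precise answers require many more simulated data sets.
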

 

 



Copyright (c) 2007 GraphPad Software Inc. All rights reserved.
URL: http://www.graphpad.com/help/Prism5/Prism5Help.html?reg_simulations_and_script_to_asse.htm