Simulating data with random error
Key concepts: Simulating data
Simulation is an underused tool. It is a great way to understand models and plan experiments. Prism lets you combine an analysis that simulates data with a script that repeats the simulation many times, as a way to perform Monte Carlo analyses. Note that this analysis doesn't in fact analyze any data. Instead, it generates curves from an equation.

How to: Simulate XY data

How to: Simulate column data
Prism can only simulate XY data. But you can simulate column data by following these steps.

How Prism generates random numbers
Prism can add random values to each of the calculated Y values to simulate experimental error.

The only way to generate truly random numbers is through a random physical process, such as tossing dice or measuring intervals between radioactive decays. Prism, like all computer programs, generates “random” numbers from defined calculations. Since the sequence of numbers is reproducible, mathematicians say that the numbers are “pseudo-random”. The difference between truly random and pseudo-random numbers rarely creates a problem. For most purposes, computer-generated random numbers are random enough to simulate data and test analytical methods. Prism uses the time of day when calculating the first random number, so you will get a different series of random numbers every time you run the program.

Prism generates random values from a Gaussian distribution using routines adapted from ideas presented in Numerical Recipes in C (W. H. Press et al., second edition, Cambridge University Press, 1992). The function RAN3 (defined in Numerical Recipes) generates uniformly distributed random numbers, and the function GASDEV transforms them to a Gaussian distribution with a mean of zero and a standard deviation you enter.

If you choose relative error, Prism first calculates a random number from a Gaussian distribution with a mean of zero and an SD equal to the percent error you enter. It then multiplies that percentage by the ideal Y value to yield the actual random value that is added to the Y value.

When the Y values represent the number of objects you would observe in a certain space, or the number of events you would observe in a certain time interval, choose random numbers from a Poisson distribution. Again, our method is based on ideas from Numerical Recipes.

Prism can also generate random numbers from a t distribution with any number of degrees of freedom (df). This lets you simulate wider scatter than Gaussian.
If df is low, this distribution is very wide. If df is high (more than 20 or so), it is almost indistinguishable from a Gaussian distribution. If df=1, the distribution is extremely wide (lots of outliers) and is identical to a Lorentzian distribution, also known as the Cauchy distribution. Prism uses this equation to generate random numbers from the t distribution with df degrees of freedom:
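
As an illustration only, a textbook way to draw a t-distributed value is to divide one standard Gaussian draw by the square root of the mean of df squared standard Gaussian draws. This may or may not match the equation Prism actually uses. The Python sketch below applies that construction; the names `random_t` and `fraction_beyond`, the cutoff of 3, and the seed are hypothetical choices made for the example.

```python
import math
import random


def random_t(df, rng=random):
    """Draw one value from a t distribution with df degrees of freedom.

    Textbook construction (an assumption, not necessarily Prism's equation):
    t = Z / sqrt(S / df), where Z is a standard Gaussian draw and S is the
    sum of df independent squared standard Gaussian draws.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    z = rng.gauss(0.0, 1.0)
    sum_sq = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return z / math.sqrt(sum_sq / df)


def fraction_beyond(df, cutoff=3.0, n=20000, seed=1):
    """Fraction of n simulated t values whose absolute value exceeds cutoff."""
    rng = random.Random(seed)  # fixed seed so the comparison is repeatable
    draws = (random_t(df, rng) for _ in range(n))
    return sum(abs(t) > cutoff for t in draws) / n
```

Comparing `fraction_beyond(1)` with `fraction_beyond(30)` should show far more extreme values at df=1, in line with the description of the Cauchy case above.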
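
The Gaussian (absolute), relative, and Poisson error options described earlier can also be sketched outside Prism. The Python example below is a rough illustration under stated assumptions: the ideal curve (an exponential decay starting at 100), the SD values, and the fixed seed are all hypothetical, and the Poisson sampler uses Knuth's simple multiplication method as a stand-in for the Numerical Recipes routine. Unlike Prism, which seeds from the time of day, the sketch fixes the seed so that repeated runs give the same series.

```python
import math
import random


def gaussian_error(y_ideal, sd, rng):
    """Absolute error: add a Gaussian draw with mean 0 and the given SD."""
    return y_ideal + rng.gauss(0.0, sd)


def relative_error(y_ideal, percent_sd, rng):
    """Relative error: draw a percentage from a Gaussian with mean 0 and
    SD equal to percent_sd, then scale it by the ideal Y value."""
    percent = rng.gauss(0.0, percent_sd)
    return y_ideal + (percent / 100.0) * y_ideal


def poisson_count(mean, rng):
    """Poisson draw by Knuth's multiplication method.

    A stand-in for illustration, not the Numerical Recipes routine.
    exp(-mean) underflows for large means, so the mean is capped here.
    """
    if not 0 <= mean <= 500:
        raise ValueError("mean must be between 0 and 500 for this sketch")
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate(xs, model, rng):
    """Apply an error model to each point of a hypothetical ideal curve."""
    ideal = [100.0 * math.exp(-0.3 * x) for x in xs]
    return [model(y, rng) for y in ideal]


rng = random.Random(42)  # fixed seed (assumption) so the run is reproducible
xs = list(range(10))
abs_ys = simulate(xs, lambda y, r: gaussian_error(y, 5.0, r), rng)
rel_ys = simulate(xs, lambda y, r: relative_error(y, 5.0, r), rng)
counts = simulate(xs, poisson_count, rng)
```

Note that with relative error the scatter shrinks as the ideal Y value falls, whereas with absolute Gaussian error it stays the same size along the whole curve.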