How to: Frequency distribution
1. Enter data

Choose a Column table and a column scatter graph. If you are not ready to enter your own data, choose the sample data set for frequency distributions.

2. Choose the analysis

Click Analyze and then choose Frequency distribution from the list of analyses for Column data.
3. Choose analysis options

Cumulative?

In a frequency distribution, each bin contains the number of values that lie within the range of values that define the bin. In a cumulative distribution, each bin contains the number of values that fall within or below that bin. By definition, the last bin contains the total number of values. The graph below shows a frequency distribution on the left, and a cumulative distribution of the same data on the right, both plotting the number of values in each bin.
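To make the difference concrete, here is a minimal Python sketch (not Prism's own implementation) that tabulates both kinds of distribution for a small made-up data set. The function names, the sample values, and the convention that each bin includes its lower edge but not its upper edge are assumptions chosen for illustration.

```python
from itertools import accumulate

def frequency_distribution(values, first_bin_start, bin_width, n_bins):
    """Count how many values fall in each of n_bins equal-width bins.

    Assumed convention: a bin includes its lower edge and excludes its
    upper edge. Values outside the binned range are left out.
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - first_bin_start) // bin_width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

def cumulative_distribution(counts):
    """Running total: values that fall within or below each bin."""
    return list(accumulate(counts))

# Hypothetical data, binned as [1,2), [2,3), [3,4), [4,5), [5,6)
values = [1.2, 2.5, 2.7, 3.1, 3.3, 3.8, 4.4, 5.9]
counts = frequency_distribution(values, first_bin_start=1.0, bin_width=1.0, n_bins=5)
cumulative = cumulative_distribution(counts)

print(counts)      # [1, 2, 3, 1, 1]
print(cumulative)  # [1, 3, 6, 7, 8]
```

The last cumulative entry equals the number of values that were binned, which matches the statement that the last bin contains the total.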
The main advantage of cumulative distributions is that you don't need to decide on a bin width. Instead, you can tabulate the exact cumulative distribution as shown below. The data set had 250 values, so this exact cumulative distribution has 250 points, making it a bit ragged.
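An exact cumulative distribution can be sketched in a few lines of Python: sort the values and pair each one with its rank, so no bin width is involved. This is an illustrative sketch, not Prism's code; the function name and the sample values are assumptions, and tied values simply receive consecutive ranks here.

```python
def exact_cumulative_distribution(values):
    """Return one (value, cumulative count) point per data value.

    The values are sorted, and the k-th smallest value is paired with k.
    Tied values get consecutive counts (an assumption of this sketch).
    """
    ordered = sorted(values)
    return [(v, rank) for rank, v in enumerate(ordered, start=1)]

# Hypothetical data: three values give three points
points = exact_cumulative_distribution([3.3, 1.2, 2.7])
print(points)  # [(1.2, 1), (2.7, 2), (3.3, 3)]
```

A data set of 250 values would therefore produce 250 points, one step per value.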
Relative or absolute frequencies?

Select Relative frequencies to determine the fraction (or percent) of values in each bin, rather than the actual number of values in each bin. For example, if 15 of 45 values fall into a bin, the relative frequency is 0.33 or 33%. If you choose both cumulative and relative frequencies, you can plot the distribution using a probabilities axis. When graphed this way, a Gaussian distribution is linear.

Bin width

If you chose a cumulative frequency distribution, we suggest that you create an exact distribution. In this case, you don't choose a bin width, because each value is plotted individually.

To create an ordinary frequency distribution, you must decide on a bin width. If the bin width is too large, there will be only a few bins, so you will not get a good sense of how the values are distributed. If the bin width is too small, many bins might contain only a few values (or none), and the counts in adjacent bins can fluctuate randomly so much that you will not get a sense of how the data are distributed.

How many bins do you need? Partly it depends on your goals, and partly it depends on sample size. If you have a large sample, you can have more bins and still have a smooth frequency distribution. One rule of thumb is to aim for a number of bins equal to the log base 2 of the sample size. Prism uses this as one of its two goals when it generates an automatic bin width (the other goal is to make the bin width a round number).

The figures below show the same data with three different bin widths. The graph in the middle shows the shape of the distribution clearly. The one on the left has too little detail, while the one on the right has too much detail.
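The two calculations described in this section, relative frequencies and a bin count based on log base 2 of the sample size, can be sketched in Python as follows. This is only an illustration under stated assumptions: Prism's actual automatic bin width algorithm is not documented here, so the rounding step (rounding up to 1, 2, or 5 times a power of ten) is a stand-in for "make the bin width a round number", and all function names are hypothetical.

```python
import math

def relative_frequencies(counts):
    """Convert per-bin counts into fractions of the total."""
    total = sum(counts)
    return [c / total for c in counts]

def suggested_bin_count(sample_size):
    """Rule of thumb from the text: about log2(sample size) bins."""
    return max(1, round(math.log2(sample_size)))

def round_bin_width(raw_width):
    """Round a width up to 1, 2, or 5 times a power of ten.

    Assumption: this is one plausible meaning of "round number",
    not Prism's documented behavior.
    """
    base = 10 ** math.floor(math.log10(raw_width))
    for step in (1, 2, 5):
        if step * base >= raw_width:
            return step * base
    return 10 * base

# Example from the text: 15 of 45 values fall into the first bin
fractions = relative_frequencies([15, 20, 10])
print(round(fractions[0], 2))  # 0.33

# Hypothetical sample: 250 values spanning a range of 100 units
n_bins = suggested_bin_count(250)       # log2(250) is about 7.97
width = round_bin_width(100 / n_bins)   # raw width 12.5
print(n_bins, width)                    # 8 20
```

Because the width is rounded up, the final number of bins can end up smaller than the rule of thumb suggests; the two goals pull against each other, which is why the text describes them as separate goals.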
Replicates

If you entered replicate values, Prism can either place each replicate into its appropriate bin, or average the replicates and only place the mean into a bin.

All values too small to fit in the first bin are omitted from the analysis. You can also enter an upper limit to omit larger values from the analysis.

How to graph

See these examples.