KNOWLEDGEBASE - ARTICLE #447

Is it OK to compare data sets with different N values using one-way ANOVA and post tests?

Yes, Prism handles unequal N just fine (but see note below about a bug in Prism 3).


If you select the Tukey post-test in Prism, you are actually doing the Tukey-Kramer test, which includes the extension by Kramer to allow for unequal sample sizes in that post test. The other post tests also handle unequal N.
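To make the Kramer extension concrete, here is a minimal Python sketch of the studentized range statistic q for one pair of groups with unequal N. It uses the textbook Tukey-Kramer formula, not Prism's own code, and the function names and sample data are hypothetical. The only change from the equal-N Tukey test is that the standard error uses the average of 1/n for the two groups being compared.

```python
from math import sqrt
from statistics import mean


def pooled_mse(groups):
    """Return (mean square within groups, within-groups df) for one-way ANOVA."""
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_within = sum(len(g) for g in groups) - len(groups)
    return ss_within / df_within, df_within


def tukey_kramer_q(groups, i, j):
    """Return (q, within-groups df) for comparing group i with group j.

    Textbook Tukey-Kramer standard error:
        SE = sqrt((MSE / 2) * (1/n_i + 1/n_j))
    With equal n this reduces to the ordinary Tukey SE, sqrt(MSE / n).
    """
    mse, df_within = pooled_mse(groups)
    gi, gj = groups[i], groups[j]
    se = sqrt((mse / 2.0) * (1.0 / len(gi) + 1.0 / len(gj)))
    return abs(mean(gi) - mean(gj)) / se, df_within


# Hypothetical data: three groups with N = 3, 4 and 2.
example_groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0], [5.0, 7.0]]
q_01, df_01 = tukey_kramer_q(example_groups, 0, 1)
print(f"q = {q_01:.4f} with {df_01} within-groups df")
```

The resulting q would then be compared against the studentized range distribution for the number of groups and the within-groups df. That lookup is omitted here because it needs a statistics library or table.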

Beware that two of the assumptions of ANOVA -- that the data come from Gaussian distributions and that the scatter (SD) of all the groups is identical -- matter much more when sample size varies a lot between groups. If you have very different sample sizes, a small P value from ANOVA may be due to non-Gaussian data (or unequal variances) rather than differences among means.

There are two ways to enter data for one-way ANOVA into Prism. We have always suggested creating a data table with no subcolumns (a "Column" table, in Prism 5). Each column is one group, with values stacked in different rows. This has always worked fine.

The alternative way to enter data for one-way ANOVA is to create a data table with subcolumns (a "Grouped" table in Prism 5) and enter all the data on one row, with replicates side-by-side. Prism 4 and 5 analyze this kind of data fine, but Prism 3 had a bug when data were entered this way.

Prism 3 used the number of subcolumns as "N" for each group, rather than the actual number of values entered. If there were no missing values (so the same N for all groups), the two counts were identical and Prism 3 gave the correct results. If some values were missing, Prism 3 used the wrong N values, and so gave incorrect results for both the overall one-way ANOVA and the multiple comparisons tests.

This problem is easy to spot because the degrees of freedom (df) in the results are incorrect. In one-way ANOVA, the residual (within-groups) df should equal the total number of values minus the number of groups. When data were entered on one row, Prism 3 instead reported df as the number of cells minus the number of columns. When counting cells, it included blank ones, so the number of cells was equal to the number of columns times the number of replicate subcolumns.
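The df check described above can be sketched in a few lines of Python. This only reproduces the arithmetic given in this article; it is not Prism's code, and the function names, the table layout, and the use of None for a blank cell are assumptions made for illustration.

```python
def df_within_correct(groups):
    """Correct residual df: number of values actually entered minus number of groups."""
    n_values = sum(1 for g in groups for v in g if v is not None)
    return n_values - len(groups)


def df_within_prism3_bug(groups):
    """df as Prism 3 reported it for one-row data: cells (blanks included) minus columns."""
    n_cells = sum(len(g) for g in groups)
    return n_cells - len(groups)


# Hypothetical one-row table: 3 groups, 4 replicate subcolumns each, 2 cells left blank.
one_row_table = [
    [4.1, 3.9, 4.4, 4.0],
    [5.2, 5.0, None, 5.5],
    [6.1, None, 6.3, 5.9],
]

print("correct df:", df_within_correct(one_row_table))      # 10 values - 3 groups = 7
print("Prism 3 df:", df_within_prism3_bug(one_row_table))   # 12 cells  - 3 groups = 9
```

If the df in an old Prism 3 results sheet matches the second number rather than the first, the analysis was affected and should be rerun in a later version or with the data stacked into columns.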

This bug only affected one-way ANOVA in Prism 3 when you entered data on one row, with side-by-side replicates, and left some cells blank. It was fixed in Prism 4.00. All our examples for one-way ANOVA use the other method -- stacking replicates into columns (with no subcolumns). That method worked perfectly in Prism 3, even with missing values.
