Fixed bug: Multiple comparisons after two-way repeated measures ANOVA with three or more columns (fixed in 4.03/4.0c)

There is a bug in how GraphPad Prism 3 and 4 (up to 4.02 and 4.0b) compute the post tests following repeated measures ANOVA. Also see this related page.

When does the bug occur and how does it affect the results?

The bug only occurs when you have arranged the data so that each row represents a different time point and related (repeated measures) values are stacked into subcolumns. The post test calculations are correct when each column represents a different time point.

The effect of the bug depends on your data. In many cases it won't be noticeable. In some cases, the analysis mistakenly has too much power and finds 'significant' differences when it shouldn't. In others, the bug results in too little power and the analysis doesn't find 'significant' differences when it should.

How can I get the right answer?

This bug has been fixed in GraphPad Prism 4.03 (Windows) and 4.0c (Mac).

There also is a workaround that will let you get the right post test results quite easily from older versions of Prism 3 or 4. The trick is to do the ANOVA twice. First do repeated-measures ANOVA to get the overall results (but not post tests). Then do ordinary, not repeated measures, two-way ANOVA to get post-test results. Why? When you are doing post tests comparing two groups at one time point, the fact that the experiment was done in a repeated-measures fashion is completely irrelevant. For each post test, you compare one group of subjects treated one way and another group of subjects treated the other way. It really doesn't matter if you used those same subjects at other time points (repeated measures) or different subjects at other time points (ordinary ANOVA). So the post tests following ordinary ANOVA are exactly right. With the bug fixed in 4.03 and 4.0c, the post tests following ordinary ANOVA and repeated-measures ANOVA (when each row represents a different time point) are identical.

Is two-way repeated measures ANOVA the right way to analyze my data?

Before recalculating results, first consider whether repeated-measures two-way ANOVA is the best way to analyze your data. I think this method is overused by biologists when one of the factors is time or concentration.

Detailed explanation

The post tests compare the difference between two group means with the square root of an appropriate Mean Square (MS) value from the ANOVA table, taking into account the corresponding number of degrees of freedom (DF).  Prism only offers post tests comparing two columns at a certain row, but the repeated measures factor can be by row or by column.

When each column represents a different time point, there is no bug. You are looking at one treatment (for each post test) and comparing two time points. This is sort of like a paired t test. In this case, the ANOVA table partitions the overall variation into that due to the treatment factor, that due to the repeated measurement factor, that due to variation between subjects, and remaining experimental (residual) error. The appropriate MS value to use in the post test is MSresidual, and the corresponding DF value is DFresidual. Prism has always handled this situation correctly. For this case (each column is a different time point), the post test correctly corrects for the repeated measurements.

When each row represents a different time point, the bug causes Prism to do the calculations incorrectly. For each post test, you are comparing two groups at one time point. So for that comparison, the fact that the study was repeated measures really doesn't matter. At each time point you are looking at, you have one group of subjects treated one way and another group of subjects treated the other way. While the ANOVA table separates variation between subjects from experimental (residual) error, this distinction is irrelevant to the post test. So the correct MS value to use in the post test combines both MSresidual and MSsubject. This MS value, which is called MSwithincells and is not shown on the ANOVA table, can be computed from the sum-of-squares (SS) terms in the ANOVA table.

  MSwithincells = [(SSsubject + SSresidual) / (DFsubject + DFresidual)]

The appropriate degrees of freedom for the post test is the sum of DFsubject and DFresidual.

MSwithincells can also be defined in this equivalent form:

  MSwithincells = (MSsubject*DFsubject + MSresidual*DFresidual) / (DFsubject + DFresidual)

You can see that MSwithincells is a weighted average of MSsubject and MSresidual, so its value lies between MSsubject and MSresidual.
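
The two forms of MSwithincells can be checked with a few lines of Python. This is only a sketch: the function name ms_within_cells and the SS and DF values are hypothetical, chosen for illustration, and do not come from Prism or from any real data set.

```python
def ms_within_cells(ss_subject, df_subject, ss_residual, df_residual):
    """Pool the subject and residual rows of a repeated-measures ANOVA table.

    Returns the pooled mean square and its degrees of freedom.
    """
    df_pooled = df_subject + df_residual
    ms_pooled = (ss_subject + ss_residual) / df_pooled
    return ms_pooled, df_pooled


# Hypothetical ANOVA table entries (illustration only).
ss_subject, df_subject = 120.0, 4
ss_residual, df_residual = 40.0, 8

ms_pooled, df_pooled = ms_within_cells(ss_subject, df_subject,
                                       ss_residual, df_residual)

# Equivalent weighted-average form, starting from the MS values instead.
ms_subject = ss_subject / df_subject
ms_residual = ss_residual / df_residual
ms_weighted = ((ms_subject * df_subject + ms_residual * df_residual)
               / (df_subject + df_residual))

print("MSwithincells (from SS):", ms_pooled, "with DF =", df_pooled)
print("MSwithincells (weighted average of MS):", ms_weighted)
```

With these made-up numbers, both forms give the same value, and that value falls between MSresidual and MSsubject.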

When you perform regular (not repeated measures) ANOVA, the MSresidual term accounts for all the variability not explained by either the row or column factor. So the MSresidual term in ordinary two-way ANOVA is identical to MSwithincells defined above for repeated measures ANOVA.
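
The identity can be verified numerically. The sketch below assumes a balanced design (the same number of subjects in every group, no missing values) and an ordinary two-way ANOVA that includes the interaction term, so that its residual SS is simply the variation of each value around its cell mean. The data values and variable names are hypothetical.

```python
# Hypothetical data: data[group][subject] holds one subject's values over time.
data = [
    [[10.0, 12.0, 15.0], [11.0, 14.0, 18.0], [9.0, 11.0, 13.0]],  # group 0
    [[10.0, 16.0, 21.0], [12.0, 17.0, 25.0], [8.0, 15.0, 20.0]],  # group 1
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


n_groups = len(data)
n_subjects = len(data[0])
n_times = len(data[0][0])

ss_within_cells = 0.0  # residual SS of ordinary two-way ANOVA
ss_subject = 0.0       # subjects within groups (repeated-measures table)
ss_residual = 0.0      # residual of the repeated-measures table

for group in data:
    # Mean of each cell (this group at each time point).
    cell_means = [mean(subj[t] for subj in group) for t in range(n_times)]
    subject_means = [mean(subj) for subj in group]
    group_mean = mean(subject_means)
    for subj, subj_mean in zip(group, subject_means):
        ss_subject += n_times * (subj_mean - group_mean) ** 2
        for t, y in enumerate(subj):
            ss_within_cells += (y - cell_means[t]) ** 2
            ss_residual += (y - cell_means[t] - subj_mean + group_mean) ** 2

df_subject = n_groups * (n_subjects - 1)
df_residual = n_groups * (n_subjects - 1) * (n_times - 1)
df_within_cells = n_groups * n_times * (n_subjects - 1)

print("SSsubject + SSresidual:", ss_subject + ss_residual)
print("SS within cells (ordinary ANOVA residual):", ss_within_cells)
print("DF:", df_subject + df_residual, "vs", df_within_cells)
```

Both the sums of squares and the degrees of freedom agree, so the pooled MS from the repeated-measures table equals the residual MS of the ordinary analysis for data arranged this way.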

The bug is that Prism 3 and 4 (up to 4.02 and 4.0b) use the MSsubject and DFsubject terms when performing post tests, rather than MSwithincells and its degrees of freedom (DFsubject + DFresidual).

When MSsubject and MSresidual have similar values, the bug has very little impact. When MSsubject and MSresidual have very different values, the effect of the bug on the post tests can be profound. Since MSsubject can either be larger or smaller than MSresidual, the effect of the bug can go in either direction.
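
To illustrate both directions, the sketch below compares the t ratio computed with MSsubject (the buggy choice) against the one computed with MSwithincells. It assumes the usual form of the post test ratio, the difference between means divided by sqrt(MS * (1/n1 + 1/n2)). The function names and all numbers are hypothetical, and the change in degrees of freedom (which also affects the critical value) is not modeled here.

```python
import math


def post_test_t(mean_diff, ms, n1, n2):
    """t ratio for comparing two group means, given a mean square."""
    return mean_diff / math.sqrt(ms * (1.0 / n1 + 1.0 / n2))


def compare(ms_subject, ms_residual, df_subject=4, df_residual=8,
            mean_diff=6.0, n1=3, n2=3):
    """Return (t using MSsubject, t using MSwithincells)."""
    ms_within = ((ms_subject * df_subject + ms_residual * df_residual)
                 / (df_subject + df_residual))
    t_buggy = post_test_t(mean_diff, ms_subject, n1, n2)
    t_correct = post_test_t(mean_diff, ms_within, n1, n2)
    return t_buggy, t_correct


# Case 1: MSsubject much larger than MSresidual, so the buggy t is too small.
t_buggy_1, t_correct_1 = compare(ms_subject=30.0, ms_residual=5.0)

# Case 2: MSsubject smaller than MSresidual, so the buggy t is too large.
t_buggy_2, t_correct_2 = compare(ms_subject=2.0, ms_residual=5.0)

print("Case 1: buggy t = %.3f, correct t = %.3f" % (t_buggy_1, t_correct_1))
print("Case 2: buggy t = %.3f, correct t = %.3f" % (t_buggy_2, t_correct_2))
```

In the first case the buggy calculation understates the t ratio (too little power); in the second it overstates it (too much power).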


G Keppel and TD Wickens, page 452.

SE Maxwell and HD Delaney, page 604.
