
Repeated-measures ANOVA compares the means of three or more matched groups. The term repeated-measures strictly applies only when you give treatments repeatedly to each subject, and the term randomized block is used when you randomly assign treatments within each group (block) of matched subjects. The analyses are identical for repeated-measures and randomized block experiments, and Prism always uses the term repeated-measures.

P value

The P value answers this question:

If all the populations really have the same mean (the treatments are ineffective), what is the chance that random sampling would result in means as far apart (or more so) as observed in this experiment?

If the overall P value is large, the data do not give you any reason to conclude that the means differ. Even if the true means were equal, you would not be surprised to find means this far apart just by chance. This is not the same as saying that the true means are the same. You just don't have compelling evidence that they differ.

If the overall P value is small, then it is unlikely that the differences you observed are due to random sampling. You can reject the idea that all the populations have identical means. This doesn't mean that every mean differs from every other mean, only that at least one differs from the rest. Look at the results of post tests to identify where the differences are.
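The question the P value answers can also be posed by brute force. The following is a minimal sketch, not Prism's calculation (Prism derives the P value from the F ratio in the ANOVA table). It assumes that, if the treatments are ineffective, each subject's values could have landed under any treatment, so it rearranges every subject's values in all possible ways and counts how often the column means end up at least as far apart as observed. The data set and the function names are made up for illustration.

```python
from itertools import permutations, product

# Made-up example: 4 subjects (rows) x 3 treatments (columns).
data = [
    [10, 12, 15],
    [8, 11, 13],
    [12, 13, 18],
    [9, 12, 14],
]


def spread_of_column_means(rows):
    """Sum of squared deviations of the column means from the grand mean."""
    n = len(rows)
    k = len(rows[0])
    grand = sum(sum(row) for row in rows) / (n * k)
    col_means = [sum(row[j] for row in rows) / n for j in range(k)]
    return sum((m - grand) ** 2 for m in col_means)


def permutation_p_value(rows):
    """Fraction of within-subject rearrangements whose column means are
    at least as far apart as in the observed table."""
    observed = spread_of_column_means(rows)
    orderings_per_subject = [list(permutations(row)) for row in rows]
    as_extreme = 0
    total = 0
    for arrangement in product(*orderings_per_subject):
        total += 1
        # Small tolerance guards against floating-point rounding.
        if spread_of_column_means(arrangement) >= observed - 1e-9:
            as_extreme += 1
    return as_extreme / total


p_value = permutation_p_value(data)
print(f"permutation P value: {p_value:.4f}")
```

Enumerating every arrangement is only practical for small tables like this one (6 orderings per subject, 4 subjects, 1296 arrangements). A small result here means the same thing as a small overall P value: means this far apart would rarely arise if the treatments made no difference.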

Was the matching effective?

A repeated-measures experimental design can be very powerful, as it controls for factors that cause variability between subjects. If the matching is effective, the repeated-measures test will yield a smaller P value than an ordinary ANOVA. The repeated-measures test is more powerful because it separates between-subject variability from within-subject variability. If the matching is ineffective, however, the repeated-measures test can be less powerful because it has fewer degrees of freedom.

Prism tests whether the matching was effective and reports a P value that tests the null hypothesis that the population row means are all equal. If this P value is low, you can conclude that the matching was effective. If the P value is high, the data give you no evidence that the matching was effective, and you should reconsider your experimental design.
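To see why effective matching helps, it can be useful to compute the treatment F ratio twice on the same table: once as an ordinary one-way ANOVA, where differences between subjects stay in the error term, and once as a repeated-measures ANOVA, where they are removed. This is a rough sketch with made-up data and a hypothetical function name; it is not how Prism reports the test of matching.

```python
# Made-up example: 4 subjects (rows) x 3 treatments (columns).
data = [
    [10, 12, 15],
    [8, 11, 13],
    [12, 13, 18],
    [9, 12, 14],
]


def treatment_f_ratios(rows):
    """Return (ordinary ANOVA F, repeated-measures F) for the treatment effect."""
    n = len(rows)       # number of subjects
    k = len(rows[0])    # number of treatments
    grand = sum(sum(row) for row in rows) / (n * k)
    col_means = [sum(row[j] for row in rows) / n for j in range(k)]
    row_means = [sum(row) / k for row in rows]

    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_treatment = n * sum((m - grand) ** 2 for m in col_means)
    ss_individual = k * sum((m - grand) ** 2 for m in row_means)
    ss_residual = ss_total - ss_treatment - ss_individual

    ms_treatment = ss_treatment / (k - 1)
    # Ordinary ANOVA: subject-to-subject variation is part of the error term.
    ms_error_ordinary = (ss_individual + ss_residual) / (k * (n - 1))
    # Repeated measures: subject-to-subject variation is separated out,
    # at the cost of fewer error degrees of freedom.
    ms_error_repeated = ss_residual / ((k - 1) * (n - 1))
    return ms_treatment / ms_error_ordinary, ms_treatment / ms_error_repeated


f_ordinary, f_repeated = treatment_f_ratios(data)
print(f"ordinary F = {f_ordinary:.2f}, repeated-measures F = {f_repeated:.2f}")
```

In this made-up table the subjects differ consistently from one another, so removing that variation shrinks the error term and the repeated-measures F ratio is the larger of the two. If subjects did not differ, the two error terms would be similar and the repeated-measures test would simply have fewer degrees of freedom.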

F ratio and ANOVA table

The P values are calculated from the ANOVA table. With repeated-measures ANOVA, there are three sources of variability: between columns (treatments), between rows (individuals), and random (residual). The ANOVA table partitions the total sum-of-squares into those three components. It then adjusts for the number of groups and number of subjects (expressed as degrees of freedom) to compute two F ratios. The main F ratio tests the null hypothesis that the column means are identical. The other F ratio tests the null hypothesis that the row means are identical (this is the test for effective matching). In each case, the F ratio is expected to be near 1.0 if the null hypothesis is true. If F is large, the P value will be small.
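The partition described above can be sketched in a few lines. This is an illustration under stated assumptions, not Prism's code: the data are made up, rows are subjects, columns are treatments, there are no missing values, and the function and key names are hypothetical. The P values would then come from the F distribution with the degrees of freedom shown, which this sketch does not compute.

```python
# Made-up example: 4 subjects (rows) x 3 treatments (columns).
data = [
    [10, 12, 15],
    [8, 11, 13],
    [12, 13, 18],
    [9, 12, 14],
]


def rm_anova_table(rows):
    """Partition the total sum-of-squares and compute both F ratios."""
    n = len(rows)       # number of subjects
    k = len(rows[0])    # number of treatments
    grand = sum(sum(row) for row in rows) / (n * k)
    col_means = [sum(row[j] for row in rows) / n for j in range(k)]
    row_means = [sum(row) / k for row in rows]

    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_treatment = n * sum((m - grand) ** 2 for m in col_means)   # between columns
    ss_individual = k * sum((m - grand) ** 2 for m in row_means)  # between rows
    ss_residual = ss_total - ss_treatment - ss_individual         # random

    df_treatment = k - 1
    df_individual = n - 1
    df_residual = (k - 1) * (n - 1)
    ms_residual = ss_residual / df_residual

    return {
        "ss_total": ss_total,
        "ss_treatment": ss_treatment,
        "ss_individual": ss_individual,
        "ss_residual": ss_residual,
        "df": (df_treatment, df_individual, df_residual),
        # Main F ratio: are the column (treatment) means identical?
        "f_treatment": (ss_treatment / df_treatment) / ms_residual,
        # Second F ratio: are the row (subject) means identical?
        "f_individual": (ss_individual / df_individual) / ms_residual,
    }


table = rm_anova_table(data)
for name, value in table.items():
    print(name, value)
```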

If you don't accept the assumption of sphericity

If you checked the option to not accept the assumption of sphericity, Prism does two things differently.

It applies the correction of Geisser and Greenhouse. You'll see smaller degrees of freedom, which usually are not integers. The corresponding P value is higher than it would have been without that correction.

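It reports the value of epsilon, which is a measure of how badly the data violate the assumption of sphericity. Epsilon equals 1.0 when the data show no violation and gets smaller as the violation gets worse; with k treatments it cannot fall below 1/(k-1).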
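For readers who want to see where epsilon comes from, here is a minimal sketch of the standard Greenhouse-Geisser estimate, computed from the covariance matrix of the treatment columns. It assumes this is the estimate being reported; the data and the function name are made up for illustration.

```python
# Made-up example: 4 subjects (rows) x 3 treatments (columns).
data = [
    [10, 12, 15],
    [8, 11, 13],
    [12, 13, 18],
    [9, 12, 14],
]


def greenhouse_geisser_epsilon(rows):
    """Greenhouse-Geisser estimate of epsilon for a subjects x treatments table."""
    n = len(rows)
    k = len(rows[0])
    col_means = [sum(row[j] for row in rows) / n for j in range(k)]

    # Sample covariance matrix of the treatment columns.
    cov = [
        [
            sum((row[i] - col_means[i]) * (row[j] - col_means[j]) for row in rows)
            / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]

    # Double-center the covariance matrix (remove row, column and grand means).
    row_avg = [sum(cov[i]) / k for i in range(k)]
    grand_avg = sum(row_avg) / k
    centered = [
        [cov[i][j] - row_avg[i] - row_avg[j] + grand_avg for j in range(k)]
        for i in range(k)
    ]

    trace = sum(centered[i][i] for i in range(k))
    sum_of_squares = sum(c ** 2 for row in centered for c in row)
    return trace ** 2 / ((k - 1) * sum_of_squares)


n_subjects, n_treatments = len(data), len(data[0])
epsilon = greenhouse_geisser_epsilon(data)

# The correction multiplies both degrees of freedom by epsilon,
# which is why the corrected values are usually not integers.
df_treatment = epsilon * (n_treatments - 1)
df_residual = epsilon * (n_treatments - 1) * (n_subjects - 1)
print(f"epsilon = {epsilon:.3f}, corrected df = {df_treatment:.2f}, {df_residual:.2f}")
```

The F ratio itself is unchanged by the correction. Only the degrees of freedom used to look up the P value shrink, which is what makes the corrected P value higher.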


Prism reports two different R² values, computed by taking ratios of sum-of-squares (SS):

To quantify how large the treatment effects are. There are two ways to compute this. Prism uses the method described by Keppel (1), in which R² is the variation due to treatment effects as a fraction of the sum of the variation due to treatment effects plus random variation. That text refers to the value as both R² and eta squared, and states that this value is an estimate of the partial omega squared. It is computed simply as SStreatment divided by the sum of SStreatment and SSresidual. Note that variation between subjects (SSindividual) is not part of the calculation. This R² is reported in the results section with the heading "Repeated measures ANOVA summary".

To quantify the effectiveness of matching. This R² quantifies the fraction of total variation that is due to differences among subjects. It is computed as SSindividual divided by SStotal, and is reported in the results section with the heading "Was the matching effective".
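As a quick illustration of the two ratios, the sketch below computes both from the three sum-of-squares components of the ANOVA table. The input numbers are invented for the example and the function name is hypothetical.

```python
def r_squared_values(ss_treatment, ss_individual, ss_residual):
    """Return (R2 for treatment effect, R2 for effectiveness of matching)."""
    ss_total = ss_treatment + ss_individual + ss_residual
    # Treatment effect: between-subject variation is left out of the denominator.
    r2_treatment = ss_treatment / (ss_treatment + ss_residual)
    # Matching: fraction of all variation that is due to differences among subjects.
    r2_matching = ss_individual / ss_total
    return r2_treatment, r2_matching


# Invented sum-of-squares values, not taken from any real experiment.
r2_treatment, r2_matching = r_squared_values(
    ss_treatment=55.5, ss_individual=21.5, ss_residual=3.0
)
print(f"R2 (treatment) = {r2_treatment:.3f}, R2 (matching) = {r2_matching:.3f}")
```

Because the two values use different denominators, they do not need to sum to 1.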

Multiple comparisons tests and analysis checklist

Learn about multiple comparisons tests after repeated measures ANOVA.

Before interpreting the results, review the analysis checklist.


(1) G Keppel and TD Wickens, Design and Analysis, Fourth Edition, 2004, ISBN: 0135159415

© 1995-2019 GraphPad Software, LLC. All rights reserved.