Every ANOVA produces an ANOVA table that partitions variation into components. Understanding this table is key to interpreting results.
Source of variation | What it represents
Between groups (or factor effects) | Variation due to differences between group means
Within groups (or residual/error) | Variation within each group due to individual differences
Total | Total variation in all of the data combined
•Sum of Squares (SS): Quantifies the amount of variation from each source
•Degrees of Freedom (df): Number of independent pieces of information
•Mean Square (MS): SS divided by df - represents variance from each source
•F-ratio: MS(between) / MS(within) - the test statistic
•P value: The probability of observing an F-ratio this large (or larger) if all group means were truly equal
F ≈ 1: Between-group and within-group variation are similar (no effect)
F >> 1: Between-group variation is much larger than within-group variation (likely effect)
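To make these quantities concrete, here is a minimal Python sketch (using made-up data) that computes the one-way ANOVA table by hand and cross-checks the result against SciPy's built-in one-way ANOVA:

```python
import numpy as np
from scipy import stats

# Hypothetical data: three groups of four measurements each
groups = [
    np.array([4.1, 5.0, 5.5, 4.8]),
    np.array([6.2, 6.8, 5.9, 6.5]),
    np.array([5.1, 4.7, 5.3, 5.0]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)        # number of groups
N = all_values.size    # total number of observations

# SS between: squared deviations of group means from the grand mean,
# weighted by group size
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# SS within: squared deviations of each value from its own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = k - 1
df_within = N - k

ms_between = ss_between / df_between   # mean square = SS / df
ms_within = ss_within / df_within

f_ratio = ms_between / ms_within
p_value = stats.f.sf(f_ratio, df_between, df_within)  # upper-tail probability

print(f"F({df_between}, {df_within}) = {f_ratio:.3f}, P = {p_value:.4f}")
print(stats.f_oneway(*groups))  # cross-check with SciPy's built-in one-way ANOVA
```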
When you have two or more factors, ANOVA can test for interactions between them. An interaction occurs when the effect of one factor depends on the level (or value) of another factor. This is often best understood by example:
No interaction example: A drug lowers blood pressure by 10 points in both men and women. The drug effect is the same regardless of gender - no interaction.
Interaction example: A drug lowers blood pressure by 20 points in men but only 5 points in women. The drug effect depends on gender - this is an interaction.
Interactions are often the most interesting finding in a multifactor experiment. They tell you that:
•Effects are not simple or additive
•You cannot fully understand one factor without considering the other
•Subgroup analyses may be warranted
•Two-way interaction: Between two factors (A × B)
•Three-way interaction: Between three factors (A × B × C)
•Higher-order interactions: Four-way and beyond (increasingly difficult to interpret). Note that even in multifactor ANOVA, Prism only includes interactions up to three-way (where applicable).
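As an illustration, the blood-pressure example above can be framed as a two-way ANOVA in which a treatment × sex interaction term captures the sex-dependent drug effect. Below is a minimal sketch using statsmodels; the column names and data values are invented for illustration:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical data mirroring the blood-pressure example:
# the drug's effect differs by sex, so we expect an interaction.
df = pd.DataFrame({
    "change":    [-22, -18, -20, -19,  -6, -4, -5, -7,
                   -1,   1,   0,  -2,  -2,  0, -1,  1],
    "treatment": ["drug"] * 8 + ["placebo"] * 8,
    "sex":       (["M"] * 4 + ["F"] * 4) * 2,
})

# Fit a two-way model with both main effects and the A x B interaction term
model = ols("change ~ C(treatment) * C(sex)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # ANOVA table including the interaction row
```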
ANOVA tells you whether at least one group differs from the others, but doesn't tell you which groups are different. Multiple comparisons tests (also called post hoc tests) answer this question.
Consider a one-way ANOVA comparing five different groups. If the P value for the F statistic is smaller than the chosen alpha threshold (typically 0.05), you would reject the null hypothesis that all group means are equal. However, you would have no information on whether:
•All five groups differ from each other
•Just one group differs from all the others
•Groups fall into two or three subsets
Multiple comparisons (post hoc) tests allow you to investigate these different possibilities.
Comparing all pairs of groups
•Tukey's test (most common)
•Sidak's test
•Holm-Sidak test
Comparing each group to a control
•Dunnett's test
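Outside of Prism, SciPy offers analogues of both approaches: tukey_hsd (SciPy 1.8 or later) for all pairwise comparisons and dunnett (SciPy 1.11 or later) for comparing each group to a control. A minimal sketch with simulated data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
control = rng.normal(10.0, 1.0, size=8)  # hypothetical control group
treat_a = rng.normal(11.5, 1.0, size=8)  # hypothetical treatment A
treat_b = rng.normal(10.2, 1.0, size=8)  # hypothetical treatment B

# Tukey's test: all pairwise comparisons among the three groups
print(stats.tukey_hsd(control, treat_a, treat_b))

# Dunnett's test: each treatment compared to the control only
res = stats.dunnett(treat_a, treat_b, control=control)
print(res.pvalue)
```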
Multiple comparisons for Prism's multifactor ANOVA
•Main effects: compares groups within a single specified factor, averaged across all other factors
•Simple effects: compares groups within a single factor at specified levels of a second factor, and averaged across any remaining factors
•Cell-by-cell comparisons: compares all unique combinations of the selected factors; each unique combination defines a group to be compared, and its values are averaged over any remaining factors
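To build intuition for what is being averaged in each case, here is a small pandas sketch with hypothetical data, contrasting main-effect means (averaged over the other factor) with cell-by-cell means:

```python
import pandas as pd

# Hypothetical two-factor data
df = pd.DataFrame({
    "response": [5, 6, 7, 8, 4, 5, 9, 10],
    "A": ["a1", "a1", "a2", "a2"] * 2,
    "B": ["b1"] * 4 + ["b2"] * 4,
})

# Main effect of A: group means averaged over the levels of B
print(df.groupby("A")["response"].mean())

# Cell-by-cell means: one mean per unique A x B combination
print(df.groupby(["A", "B"])["response"].mean())
```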
All statistical tests make assumptions about the data. When assumptions are violated, results may be inaccurate. Here are the key assumptions for ANOVA:
1. Independence of Observations
Assumption: Each observation is independent of all others. The value of one measurement doesn't influence any other measurement.
When this is violated:
•Repeated measurements on the same subject (use repeated measures ANOVA instead)
•Clustered data (e.g., multiple cells from the same animal, students in the same classroom)
•Time series data where consecutive measurements are correlated
What to do: Use a repeated measures ANOVA or nested ANOVA model when appropriate to account for the non-independence.
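As a sketch of the repeated measures approach, statsmodels provides AnovaRM, which fits a repeated measures ANOVA from long-format data; the values below are invented for illustration:

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format data: six subjects each measured at three time points
df = pd.DataFrame({
    "subject": list(range(6)) * 3,
    "time": ["t1"] * 6 + ["t2"] * 6 + ["t3"] * 6,
    "score": [5, 6, 5, 7, 6, 5,
              6, 7, 6, 8, 7, 6,
              8, 9, 8, 9, 9, 8],
})

# Repeated measures ANOVA accounts for the within-subject correlation
res = AnovaRM(df, depvar="score", subject="subject", within=["time"]).fit()
print(res)
```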
2. Normality
Assumption: The values in each group are sampled from a population with a normal (Gaussian) distribution.
How important is this?
•With large sample sizes (n > 30 per group), ANOVA is quite robust to non-normality
•With small or unequal sample sizes, non-normality is more problematic
•Extreme outliers or highly skewed data can be problematic
Testing normality:
•Prism can create normality tests and Q-Q plots
•Visual inspection of residuals is often most useful
•Remember: You're testing whether the populations are normal, not just your samples
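A minimal Python sketch of checking residual normality, using SciPy's Shapiro-Wilk test plus probplot to generate Q-Q data (the values are hypothetical):

```python
import numpy as np
from scipy import stats

groups = [np.array([4.1, 5.0, 5.5, 4.8]),
          np.array([6.2, 6.8, 5.9, 6.5]),
          np.array([5.1, 4.7, 5.3, 5.0])]

# Residuals: each value minus its own group mean
residuals = np.concatenate([g - g.mean() for g in groups])

# Shapiro-Wilk test of normality on the pooled residuals
stat, p = stats.shapiro(residuals)
print(f"Shapiro-Wilk W = {stat:.3f}, P = {p:.3f}")

# Theoretical vs. observed quantiles for a Q-Q plot (plot with matplotlib if desired)
(osm, osr), (slope, intercept, r) = stats.probplot(residuals, dist="norm")
```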
What to do if violated:
•Consider using lognormal ANOVA if your data are positively skewed
•Transform your data (log, square root, reciprocal) and rerun ANOVA
•Use nonparametric alternatives (Kruskal-Wallis, Friedman); see the sketch after this list
•With large samples, proceed with ANOVA (it's robust)
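Two of these remedies sketched in Python, with invented, positively skewed data:

```python
import numpy as np
from scipy import stats

# Invented, positively skewed data
groups = [np.array([1.2, 3.5, 2.0, 8.9]),
          np.array([2.1, 6.7, 4.4, 12.3]),
          np.array([1.8, 2.9, 5.5, 7.2])]

# Remedy 1: log-transform the data, then rerun the ANOVA
print(stats.f_oneway(*[np.log(g) for g in groups]))

# Remedy 2: nonparametric Kruskal-Wallis test on the raw data
print(stats.kruskal(*groups))
```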
3. Homogeneity of Variance (Equal Variances)
Assumption: All groups are sampled from populations with equal variances (equal standard deviations).
Testing for equal variances:
•Prism automatically reports the Brown-Forsythe test and Bartlett's test with one-way ANOVA
•Brown-Forsythe is more robust and generally preferred
•A small P value suggests that the variances are unequal
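Both tests have SciPy equivalents; the Brown-Forsythe test is Levene's test centered on the median. A minimal sketch:

```python
import numpy as np
from scipy import stats

groups = [np.array([4.1, 5.0, 5.5, 4.8]),
          np.array([6.2, 6.8, 5.9, 6.5]),
          np.array([5.1, 4.7, 5.3, 5.0])]

# Brown-Forsythe test = Levene's test with median centering (robust)
bf_stat, bf_p = stats.levene(*groups, center="median")
# Bartlett's test (more powerful, but sensitive to non-normality)
b_stat, b_p = stats.bartlett(*groups)

print(f"Brown-Forsythe P = {bf_p:.3f}, Bartlett P = {b_p:.3f}")
```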
How important is this?
•More important when sample sizes are unequal
•ANOVA is fairly robust when sample sizes are equal and large
•More problematic with small sample sizes
What to do if violated:
•Transform your data to equalize variances
•Use Welch's or Brown-Forsythe ANOVA (one-way only); see the sketch after this list
•Consider whether different variances are themselves an important finding
•Use nonparametric tests (though these also assume equal dispersions)
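As a sketch of the Welch option, statsmodels offers a Welch-type one-way ANOVA through anova_oneway with use_var="unequal" (availability and exact output depend on your statsmodels version; the data are invented):

```python
import numpy as np
from statsmodels.stats.oneway import anova_oneway

# Invented data with unequal group sizes and spreads
groups = [np.array([4.1, 5.0, 5.5, 4.8]),
          np.array([6.2, 6.8, 5.9, 6.5, 7.0, 6.1]),
          np.array([5.1, 4.7, 5.3, 5.0, 4.9])]

# Welch-type one-way ANOVA: does not assume equal variances
res = anova_oneway(groups, use_var="unequal", welch_correction=True)
print(res.statistic, res.pvalue)
```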
4. Sphericity (Repeated Measures Only)
Assumption: The variances of the differences between all pairs of repeated measures are equal.
When this applies: Only for repeated measures ANOVA with three or more repeated measures.
Example: If you measure subjects at times 1, 2, 3, and 4:
•The variance of (time1 - time2) should equal the variance of (time1 - time3), of (time1 - time4), of (time2 - time3), and so on
Testing sphericity:
Prism reports epsilon (ε) to quantify the degree of violation:
•ε = 1.00: Assumption is met
•ε < 0.75: Substantial violation
What to do if violated:
•Don't assume sphericity in the analysis dialog
•Prism will apply the Geisser-Greenhouse correction automatically
•This correction reduces the degrees of freedom, making the test more conservative (larger P-values)
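For intuition, the Greenhouse-Geisser epsilon can be estimated directly from the sample covariance matrix of the repeated measures (this is Box's epsilon estimate). A minimal numpy sketch with simulated data:

```python
import numpy as np

def greenhouse_geisser_epsilon(data):
    """Estimate epsilon from an (n_subjects, k) array of repeated measures."""
    k = data.shape[1]
    S = np.cov(data, rowvar=False)         # k x k covariance of the measures
    C = np.eye(k) - np.ones((k, k)) / k    # centering matrix
    S_dc = C @ S @ C                       # double-centered covariance
    # Box's epsilon: trace(S_dc)^2 / ((k - 1) * sum of squared entries)
    return np.trace(S_dc) ** 2 / ((k - 1) * np.sum(S_dc ** 2))

# Simulated data: six subjects measured at four time points
rng = np.random.default_rng(1)
data = rng.normal(10, 2, size=(6, 1)) + rng.normal(0, 1, size=(6, 4))

print(f"epsilon = {greenhouse_geisser_epsilon(data):.3f}")
# 1.00 means sphericity holds; values near 1/(k-1) indicate severe violation
```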
Common mistakes to avoid
•Running multiple t-tests instead of ANOVA (inflates the false positive rate)
•Ignoring the experimental design (using ordinary ANOVA when repeated measures is appropriate)
•Forgetting to check assumptions (especially with small samples)
•Over-interpreting non-significant results (absence of evidence ≠ evidence of absence)
•Focusing only on P-values (also report effect sizes and confidence intervals)
•Including too many factors (models may quickly become uninterpretable; consider simpler designs)
•Treating ordered factors as unordered (time, dose should often use regression or trend tests)
•Not visualizing interactions (tables alone don't make interactions clear)