KNOWLEDGEBASE - ARTICLE #1313

Beware of running averages and smoothed data.


The figure shows the number of hurricanes over time. The left panel shows the number of hurricanes in each year, which jumps around a lot. To make it easier to spot trends, the right panel shows a rolling average. The value plotted for each year is the average of the number of hurricanes for that year plus the prior eight years. This smoothing lets you see a clear trend.

But there is a problem. The values plotted in the left panel were chosen randomly (from a Poisson distribution, with a mean of 10). There is no pattern. Each value was randomly generated without regard to the previous (or later) values.

Creating the running average creates the impression of trends by ensuring that any large random swing to a high or low value is amplified, while the year-to-year variability is muted.

Any calculation of correlation, or of linear or nonlinear regression, on the rolling average data would be invalid because the values in the rolling average are not independent of each other. Rather each value is included as part of the neighboring values.

This example is adapted from Briggs (2008), as does the warning about  misleading correlation coefficients.



Keywords: Running mean, rolling mean, rolling average, running average, smoothing, smooth curve, Savitsky Golay

Explore the Knowledgebase

Analyze, graph and present your scientific work easily with GraphPad Prism. No coding required.