GraphPad Statistics Guide

Interpreting results: Bland-Altman

Difference vs. average

The first page of Bland-Altman results shows the difference and average values and is used to create the plot.

Bias and 95% limits of agreement

The second results page shows the average bias, or the average of the differences. The bias is computed as the value determined by one method minus the value determined by the other method. If one method is sometimes higher, and sometimes the other method is higher, the average of the differences will be close to zero. If it is not close to zero, this indicates that the two assay methods are systematically producing different results.

This page also shows the standard deviation (SD) of the differences between the two assay methods (labeled as the SD of bias). The SD value is not very useful by itself, but it is used to calculate the limits of agreement, computed as the mean bias plus or minus 1.96 times the SD of the differences.
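The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the program's own code: the function name `bland_altman`, the dictionary keys, and the sample values are made up, and the sketch assumes the SD is the sample SD (n - 1 denominator).

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return differences, averages, bias, SD of differences, and limits of agreement."""
    if len(method_a) != len(method_b):
        raise ValueError("both methods must be measured on the same samples")
    # Difference is one method minus the other; the direction only sets the sign of the bias.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    avgs = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    bias = mean(diffs)   # average of the differences
    sd = stdev(diffs)    # assumption: sample SD with n - 1 in the denominator
    return {
        "differences": diffs,
        "averages": avgs,
        "bias": bias,
        "sd": sd,
        "lower_limit": bias - 1.96 * sd,
        "upper_limit": bias + 1.96 * sd,
    }


# Made-up values for illustration only.
method_a = [10.0, 12.0, 14.0, 16.0, 18.0]
method_b = [9.0, 12.5, 13.0, 16.5, 17.0]
result = bland_altman(method_a, method_b)
print(result["bias"], result["lower_limit"], result["upper_limit"])
```

The `differences` and `averages` lists correspond to the values on the first results page, and the remaining entries to the second.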

For any future sample, the difference between measurements using these two assay methods should lie within the limits of agreement approximately 95% of the time.

Strictly speaking, the limits of agreement are a description of the data. It is possible to compute 95% prediction bands for the difference, and these would lie further from the bias in each direction than the limits of agreement do (especially when the sample is small).
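To see why prediction limits are wider, here is a minimal sketch using one common textbook formula for a prediction interval for a single future value, bias ± t × SD × sqrt(1 + 1/n), which assumes the differences are roughly Gaussian. The guide does not say which formula it has in mind, so treat this as an assumption. The function name `prediction_limits` and the sample differences are hypothetical, and the t critical value is supplied by hand (2.776 is the two-sided 95% value for 4 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev


def prediction_limits(diffs, t_crit):
    """95% prediction limits for one future difference (textbook formula, assumed here)."""
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two differences")
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD (n - 1 denominator)
    # The sqrt(1 + 1/n) term accounts for uncertainty in the estimated bias.
    half_width = t_crit * sd * sqrt(1 + 1 / n)
    return bias - half_width, bias + half_width


# Made-up differences for illustration only (n = 5, so 4 degrees of freedom).
diffs = [1.0, -0.5, 1.0, -0.5, 1.0]
T_CRIT_DF4 = 2.776  # two-sided 95% t critical value for 4 degrees of freedom

pred_low, pred_high = prediction_limits(diffs, T_CRIT_DF4)
loa_low = mean(diffs) - 1.96 * stdev(diffs)
loa_high = mean(diffs) + 1.96 * stdev(diffs)
print("limits of agreement:", loa_low, loa_high)
print("prediction limits:  ", pred_low, pred_high)
```

With only five samples the prediction limits are noticeably wider than the limits of agreement. As n grows, the t value approaches 1.96 and sqrt(1 + 1/n) approaches 1, so the two sets of limits converge.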

Interpreting the Bland-Altman results

Bland-Altman plots are generally interpreted informally, without further analyses. Ask yourself these questions:

How big is the average discrepancy between methods (the bias)? You must interpret this clinically. Is the discrepancy large enough to be important? This is a clinical question, not a statistical one.

How wide are the limits of agreement? If they are wide (as defined clinically), the results are ambiguous. If the limits are narrow (and the bias is tiny), then the two methods are essentially equivalent.

Is there a trend? Does the difference between methods tend to get larger (or smaller) as the average increases?

Is the variability consistent across the graph? Does the scatter around the bias line get larger as the average gets higher?
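The last two questions are normally answered by looking at the graph. If a rough numeric complement is wanted, one possible sketch (not something the guide prescribes) is to fit a least-squares slope of the differences against the averages, and a second slope of the absolute deviations from the bias against the averages. The function names `slope` and `trend_and_spread` and the data are hypothetical.

```python
from statistics import mean


def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values are all identical")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def trend_and_spread(avgs, diffs):
    """Return (slope of difference vs. average, slope of |difference - bias| vs. average)."""
    bias = mean(diffs)
    abs_dev = [abs(d - bias) for d in diffs]
    return slope(avgs, diffs), slope(avgs, abs_dev)


# Made-up values: the difference grows steadily with the average.
avgs = [1.0, 2.0, 3.0, 4.0, 5.0]
diffs = [0.1, 0.2, 0.3, 0.4, 0.5]
trend, spread = trend_and_spread(avgs, diffs)
print(trend, spread)
```

A first slope well away from zero suggests a trend in the difference, and a clearly positive second slope suggests the scatter widens at higher averages. Whether either is large enough to matter remains a clinical judgment.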