Frequently Asked Questions
Hazard ratio from survival analysis.
FAQ# 1226 Last Modified 1-January-2009
Definition of the hazard ratio
Hazard is defined as the slope of the survival curve — a measure of how rapidly subjects are dying.
The hazard ratio compares two treatments. If the hazard ratio is 2.0, then the rate of deaths in one treatment group is twice the rate in the other group.
The hazard ratio is not computed at any one time point, but includes all the data in the survival curve.
Since only one hazard ratio is reported, it can be interpreted only if you assume that the population hazard ratio is consistent over time, and that any differences are due to random sampling.
If the hazard ratio is not consistent over time, the value that Prism reports for the hazard ratio will not be useful. If two survival curves cross, the hazard ratios are certainly not consistent (unless they cross at late time points, when there are few subjects still being followed so there is a lot of uncertainty in the true position of the survival curves).
Note that a hazard ratio of two does not mean that the median survival time is doubled (or halved). A hazard ratio of two means a patient in one treatment group who has not died (or progressed, or whatever end point is tracked) at a certain time point has twice the probability of having died (or progressed...) by the next time point compared to a patient in the other treatment group.
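To make the point about median survival concrete, here is a minimal sketch that assumes survival follows a Weibull curve, S(t) = exp(-rate * t^shape). That model is an assumption chosen for illustration and is not part of Prism's calculations. Under it, multiplying the rate by the hazard ratio multiplies the hazard by that ratio at every time point, so hazards stay proportional. The function names are hypothetical.

```python
import math

def weibull_median(rate, shape):
    """Median survival time when S(t) = exp(-rate * t**shape)."""
    return (math.log(2) / rate) ** (1 / shape)

def median_ratio(hazard_ratio, shape):
    """Ratio of median survival times (higher-hazard group / reference group).

    Multiplying `rate` by the hazard ratio multiplies the hazard by the
    same factor at every time point, so the hazards are proportional.
    """
    reference = weibull_median(1.0, shape)
    higher_hazard = weibull_median(hazard_ratio, shape)
    return higher_hazard / reference

# A hazard ratio of 2 changes the median by a factor that depends on the
# shape of the survival curve, not by a fixed factor of one half.
for shape in (0.5, 1.0, 2.0):
    print(f"shape={shape}: median ratio = {median_ratio(2.0, shape):.3f}")
# shape=0.5: median ratio = 0.250
# shape=1.0: median ratio = 0.500
# shape=2.0: median ratio = 0.707
```

Only in the special case of a constant hazard (shape = 1, exponential survival) does a hazard ratio of two halve the median survival time under this model.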
For other cautions about interpreting hazard ratios, see these two review papers:
 Hazards of Hazard Ratios (M.A. Hernán. Epidemiology. 21:135, 2010).
 Hazard ratio in clinical trials (Spruance et al. Antimicrobial Agents and Chemotherapy. 48(8):2787, 2004).
How the hazard ratio is computed
There are two very similar ways of doing survival calculations: logrank and Mantel-Haenszel. Both are explained in chapter 3 of Machin, Cheung and Parmar, Survival Analysis (details below).
The Mantel-Haenszel approach uses these steps:
 Compute the total variance, V, as explained on pages 38-40 of a handout by Michael Vaeth. Note that he calls the test "logrank" but in a note explains that this is the more accurate test, and also gives the equation for the simpler approximation that we call logrank.
 Compute K = (O1 - E1) / V, where O1 is the total observed number of events in group 1 and E1 is the total expected number of events in group 1. If you used the other group instead, K would have the same magnitude but the opposite sign, so the hazard ratio would be the reciprocal.
 The hazard ratio equals EXP(K)

The lower 95% confidence limit of the hazard ratio equals:
EXP(K - 1.96/sqrt(V))
The upper 95% confidence limit equals:
EXP(K + 1.96/sqrt(V))
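The Mantel-Haenszel steps above can be sketched in Python as follows. This is an illustration, not Prism's code. It assumes that V is the usual hypergeometric variance summed over the distinct event times, which may differ in detail from the handout's presentation. The function names and the four-subject data set are made up for the example.

```python
import math

def logrank_sums(times, events, groups, group=1):
    """Return (O, E, V) for the subjects whose label equals `group`.

    times  : follow-up time for each subject
    events : 1 if the event (e.g. death) was observed, 0 if censored
    groups : group label for each subject (two distinct labels)
    """
    observed = expected = variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        # Subjects still at risk at time t, overall and in `group`.
        n = sum(1 for ti in times if ti >= t)
        n_g = sum(1 for ti, g in zip(times, groups) if ti >= t and g == group)
        # Events occurring exactly at time t, overall and in `group`.
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d_g = sum(1 for ti, e, g in zip(times, events, groups)
                  if ti == t and e and g == group)
        observed += d_g
        expected += d * n_g / n
        if n > 1:
            # Hypergeometric variance at this event time (assumed form of V).
            variance += d * (n_g / n) * (1 - n_g / n) * (n - d) / (n - 1)
    return observed, expected, variance

def mantel_haenszel_hr(observed, expected, variance, z=1.96):
    """Return (HR, lower limit, upper limit) from O, E and V."""
    k = (observed - expected) / variance
    half_width = z / math.sqrt(variance)
    return math.exp(k), math.exp(k - half_width), math.exp(k + half_width)

# Made-up data: two subjects per group, no censoring.
times = [1, 2, 3, 4]
events = [1, 1, 1, 1]
groups = [1, 1, 2, 2]

o1, e1, v = logrank_sums(times, events, groups, group=1)
hr, lower, upper = mantel_haenszel_hr(o1, e1, v)
print(f"O1={o1:.3f}  E1={e1:.3f}  V={v:.3f}")
print(f"HR={hr:.3f}  95% CI {lower:.3f} to {upper:.3f}")
```

With so few subjects the confidence interval is extremely wide. The data are only meant to make the arithmetic easy to follow by hand.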
The logrank approach uses these steps:
 As part of the Kaplan-Meier calculations, compute the number of observed events (deaths, usually) in each group (Oa and Ob), and the number of expected events assuming a null hypothesis of no difference in survival (Ea and Eb).

The hazard ratio then is:
HR = (Oa/Ea)/(Ob/Eb)
The standard error of the natural logarithm of the hazard ratio is:
sqrt(1/Ea + 1/Eb)

The lower 95% confidence limit of the hazard ratio equals:
EXP( (Oa - Ea)/V - 1.96*sqrt(1/Ea + 1/Eb))
The upper 95% confidence limit equals:
EXP( (Oa - Ea)/V + 1.96*sqrt(1/Ea + 1/Eb))
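The logrank (O/E) steps can be sketched as below, starting from the four summary counts. The function name and the example counts are hypothetical. By default the sketch centers the confidence interval on ln(HR), which is an assumption made here so that the estimate always falls inside its own interval. The formulas in the text center it on (Oa - Ea)/V, and passing that value as `center` reproduces them.

```python
import math

def logrank_hr(oa, ea, ob, eb, z=1.96, center=None):
    """Return (HR, SE of ln HR, lower limit, upper limit).

    oa, ob : observed events in groups a and b
    ea, eb : expected events in groups a and b under the null hypothesis
    center : value on the log scale around which the interval is built;
             defaults to ln(HR) (an assumption of this sketch). Pass
             (Oa - Ea)/V to follow the formulas in the text.
    """
    hr = (oa / ea) / (ob / eb)
    se = math.sqrt(1 / ea + 1 / eb)
    if center is None:
        center = math.log(hr)
    return hr, se, math.exp(center - z * se), math.exp(center + z * se)

# Made-up counts; observed and expected totals both sum to 50.
hr, se, lower, upper = logrank_hr(oa=30, ea=20, ob=20, eb=30)
print(f"HR={hr:.3f}  SE(ln HR)={se:.3f}  95% CI {lower:.3f} to {upper:.3f}")
```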
The two methods compared
The two usually give identical (or nearly identical) results. But the results can differ when several subjects die at the same time or when the hazard ratio is far from 1.0.
Bernstein and colleagues analyzed simulated data with both methods (1). In all their simulations, the assumption of proportional hazards was true. The two methods gave very similar values. The logrank method (which they refer to as the O/E method) reports values that are closer to 1.0 than the true hazard ratio, especially when the hazard ratio is large or the sample size is large.
When there are ties, both methods are less accurate. The logrank method tends to report hazard ratios that are even closer to 1.0 (so the reported hazard ratio is too small when the hazard ratio is greater than 1.0, and too large when the hazard ratio is less than 1.0). The Mantel-Haenszel method, in contrast, reports hazard ratios that are further from 1.0 (so the reported hazard ratio is too large when the hazard ratio is greater than 1.0, and too small when the hazard ratio is less than 1.0).
They did not test the two methods with data simulated under non-proportional hazards. I have seen one data set where the two estimates of HR were very different (by a factor of three), and the assumption of proportional hazards was dubious for those data (Excel file). It seems that the Mantel-Haenszel method gives more weight to differences in the hazard at late time points, while the logrank method gives equal weight everywhere (but I have not explored this in detail). If you see very different HR values with the two methods, think about whether the assumption of proportional hazards is reasonable. If that assumption is not reasonable, then of course the entire concept of a single hazard ratio describing the entire curve is not meaningful.
How Prism computes the Hazard Ratio
Prism 4 uses the logrank method to compute the hazard ratio, but uses the Mantel-Haenszel approach to calculate the confidence interval of the hazard ratio. The results can be inconsistent. In rare cases, the hazard ratio reported by Prism 4 could lie outside the confidence interval that Prism 4 reports for it.
Prism 5 computes both the hazard ratio and its confidence interval using the Mantel-Haenszel approach.
References
1. L. Bernstein, J. Anderson and M.C. Pike. Estimation of the proportional hazard in two-treatment-group clinical trials. Biometrics (1981) vol. 37 (3) pp. 513-519.
Survival Analysis: A Practical Approach  
by David Machin, Yin Bun Cheung, Mahesh Parmar  
ISBN: 0470870400