
Logrank and Gehan-Breslow-Wilcoxon tests

The null hypothesis for each of the tests that Prism performs is that the survival curves for each group are identical in the overall populations from which subjects in each group were sampled. In other words, if the two groups represent a “treatment” group and a “control” group, the null hypothesis would be that the treatment did not affect survival.

The P value for these tests answers this question:

If the null hypothesis is true, what is the probability of randomly selecting subjects whose survival curves are as different as (or more different than) those actually observed?

If the P value is small enough (smaller than a pre-specified threshold), then we say that the null hypothesis is rejected. Note that the P value is based on comparing entire survival curves, not on comparing only the median survival for each group.

The difference between the logrank test and the Gehan-Breslow-Wilcoxon test is that the latter places more weight on events of interest that occur at early time points. Note that Prism lets you choose one of two algorithms for computing the P value when comparing three or more groups. The results will show "(conservative)" or "(recommended)", to document your choice.
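To make the difference in weighting concrete, here is a minimal Python sketch of a two-group comparison. It assumes the standard textbook form of both tests: each event time contributes the observed minus the expected number of events in one group, weighted by 1 for the logrank test and by the total number of subjects still at risk for the Gehan-Breslow-Wilcoxon test. The function name `weighted_logrank` and its arguments are made up for this illustration, and the code is not a description of Prism's internal calculations.

```python
import math

def weighted_logrank(times_a, events_a, times_b, events_b, weight="logrank"):
    """Two-group comparison (hypothetical helper, not Prism's code).

    events: 1 = event of interest observed, 0 = censored.
    weight="logrank": every event time gets weight 1.
    weight="gehan":   each event time is weighted by the number at risk,
                      which is largest at early time points.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    score = 0.0
    variance = 0.0
    for t in sorted({t for t, e, _ in data if e == 1}):
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)    # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e == 1)      # events at time t
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        w = 1.0 if weight == "logrank" else float(n)
        score += w * (d_a - d * n_a / n)                         # observed - expected
        if n > 1:
            # hypergeometric variance of d_a under the null hypothesis
            variance += w * w * d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = score ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return chi2, p_value

# Made-up data: every subject in group A has the event before anyone in group B.
a_times, a_events = [1, 2, 3], [1, 1, 1]
b_times, b_events = [4, 5, 6], [1, 1, 1]
print(weighted_logrank(a_times, a_events, b_times, b_events, "logrank"))
print(weighted_logrank(a_times, a_events, b_times, b_events, "gehan"))
```

Because the number at risk shrinks over time, the "gehan" weights shrink too, so late events count for less than early ones.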

Logrank test for trend

If you choose to compare three or more survival curves, Prism will report the results for the overall logrank test as well as the results for the logrank test for trend.

When should you look at the results for the test for trend?

The test for trend is only relevant when the order of groups (defined by data set columns in Prism) is logical. Examples would be if the groups represent different age ranges, different disease severities, or different doses of a drug. The left-to-right order of these groups in the Survival data table must correspond to ordered and equally spaced categories. If the groups are not ordered, or are ordered but not equally spaced, then you should ignore the results of the logrank test for trend.

Results of the logrank test for trend

The logrank test for trend reports a chi-square value, which is always associated with one degree of freedom (no matter how many groups are being compared). It uses that chi-square value to compute a P value testing the null hypothesis that there is no linear trend between the order of the groups and survival (the whole survival experience, not only the median survival time). If the resulting P value is smaller than a pre-specified threshold (typically 0.05), this null hypothesis can be rejected.

Prism assumes the groups are equally spaced

Computing the logrank test for trend requires assigning each group a code number. The test then looks at the trend between these group codes and survival. With some other programs, you are able to assign these codes manually, and thus deal with ordered groups that are not equally spaced. Prism uses the column number as the code, so it can only perform the test for trend assuming that the groups are equally spaced. Even if you enter numbers as column titles to represent code numbers, Prism will not use these to perform the test for trend.

How it works

The logrank test for trend looks at the linear trend between group code (specified by column number in Prism) and survival. However, it doesn’t simply look at median survival, or five-year survival, or any other specific summary measure. It first computes expected survival assuming that the null hypothesis (subjects in all groups are sampled from a population with the same survival experience) is true. Then it quantifies the overall discrepancy between the observed survival and the expected survival for each group. Finally, it looks at the trend between that discrepancy and the corresponding group code. For additional details, see the text by Machin (1).
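The three steps just described can be sketched in Python. This is an illustration under stated assumptions, not Prism's implementation: it uses a variance approximation commonly given in textbooks, which is built only from the expected counts, and Prism may compute the variance differently. The function name `logrank_trend` and the optional `codes` argument are hypothetical. Leaving `codes` unset mimics Prism's use of the column number (1, 2, 3, ...) as the group code.

```python
import math

def logrank_trend(groups, codes=None):
    """Logrank test for trend (illustrative sketch, textbook approximation).

    groups: list of (times, events) pairs in left-to-right column order;
            events: 1 = event of interest observed, 0 = censored.
    codes:  one number per group; defaults to 1, 2, 3, ... (equally spaced).
    """
    k = len(groups)
    if codes is None:
        codes = list(range(1, k + 1))
    data = [(t, e, g)
            for g, (times, events) in enumerate(groups)
            for t, e in zip(times, events)]
    observed = [0.0] * k
    expected = [0.0] * k
    # Step 1: expected events per group if all groups share one survival curve.
    for t in sorted({t for t, e, _ in data if e == 1}):
        n = sum(1 for u, _, _ in data if u >= t)              # total at risk
        d = sum(1 for u, e, _ in data if u == t and e == 1)   # events at time t
        for g in range(k):
            n_g = sum(1 for u, _, h in data if u >= t and h == g)
            observed[g] += sum(1 for u, e, h in data
                               if u == t and e == 1 and h == g)
            expected[g] += d * n_g / n
    # Step 2: discrepancy (observed - expected) for each group.
    # Step 3: trend between that discrepancy and the group code.
    trend = sum(c * (o - e) for c, o, e in zip(codes, observed, expected))
    sum_ce = sum(c * e for c, e in zip(codes, expected))
    variance = (sum(c * c * e for c, e in zip(codes, expected))
                - sum_ce ** 2 / sum(expected))
    chi2 = trend ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # always one degree of freedom
    return chi2, p_value

# Made-up data: three ordered groups, no censoring.
groups = [([1, 2], [1, 1]), ([3, 4], [1, 1]), ([5, 6], [1, 1])]
print(logrank_trend(groups))                   # codes 1, 2, 3
print(logrank_trend(groups, codes=[1, 2, 4]))  # unequally spaced codes
```

In this sketch, shifting or rescaling the codes (for example 2, 4, 6 instead of 1, 2, 3) leaves the chi-square value unchanged. Only the relative spacing of the codes matters, which is why the equal-spacing assumption deserves attention.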

Multiple comparison tests

After comparing three or more treatment groups, you may want to go back and perform pairwise comparisons of individual survival curves (looking at two specific curves at a time). Prism does not do this automatically, but it is easy to duplicate the analysis, and change the new copy to only include the two groups to be compared. This process can be repeated for each desired pair of survival curves. Note that if you manually perform these pairwise comparisons, you will need to manually adjust the threshold for determining P value “significance”, because each additional comparison increases the chance of a false positive result. Alternatively, you can copy each of the calculated P values for each pairwise comparison into a new column data table, and analyze that stack of P values.
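As an example of adjusting the threshold by hand, the sketch below applies the Bonferroni correction, which divides the overall significance level by the number of comparisons made. Bonferroni is assumed here only because it is one common and simple choice; the text above does not prescribe a particular method. The function name `bonferroni_threshold` is made up for this illustration.

```python
def bonferroni_threshold(alpha, n_groups):
    """Per-comparison threshold when every pair of groups is compared.

    Hypothetical helper: Bonferroni is an assumed choice of correction.
    """
    n_comparisons = n_groups * (n_groups - 1) // 2  # number of distinct pairs
    return alpha / n_comparisons, n_comparisons

threshold, n_pairs = bonferroni_threshold(alpha=0.05, n_groups=4)
print(f"{n_pairs} pairwise comparisons, per-comparison threshold {threshold:.4f}")
```

If only some of the possible pairs are compared, divide by the number of comparisons actually planned rather than by the number of all possible pairs.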

More information on multiple comparisons of Kaplan-Meier survival curves can be found on this page.


1. David Machin, Yin Bun Cheung, Mahesh Parmar, Survival Analysis: A Practical Approach, 2nd edition, ISBN: 0470870400.

© 1995-2019 GraphPad Software, LLC. All rights reserved.