GraphPad Prism 10 Statistics Guide - Defining a model for Cox proportional hazards regression

Defining a model for Cox proportional hazards regression

Choose the time to event (response) variable

Select the variable from the data table that contains the elapsed time to the event of interest. Note that Prism currently accepts only elapsed time for Cox proportional hazards regression. If your data contain separate “start time” and “end time” variables, convert them to a single elapsed time value for each observation (elapsed time = end time - start time) before running the analysis.
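The conversion from start and end times to elapsed time can be done before the data are entered into Prism. The following is a minimal sketch in Python, assuming the times are calendar dates and that elapsed time is wanted in whole days; the function name elapsed_days and the example dates are hypothetical.

```python
from datetime import date

def elapsed_days(start: date, end: date) -> int:
    """Return elapsed time = end time - start time, in whole days."""
    if end < start:
        raise ValueError("end time precedes start time")
    return (end - start).days

# Hypothetical (start, end) pairs, one per observation.
rows = [
    (date(2023, 1, 10), date(2023, 3, 1)),
    (date(2023, 2, 5), date(2023, 2, 20)),
]

# One elapsed time value per observation, ready to enter as a single column.
elapsed = [elapsed_days(start, end) for start, end in rows]
print(elapsed)
```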

Choose the outcome (event/censor) variable

Select the variable from the data table that indicates, for each observation, whether the event of interest occurred or the observation was censored. This variable can be either continuous or categorical. If this variable is continuous, the information is often coded with values of "1" representing individuals that experienced the event of interest and values of "0" representing individuals that were censored. If the variable is categorical, the levels may be as easy to interpret as "Died" and "Censored". Regardless of whether the outcome variable is continuous or categorical, the controls on this tab allow you to specify which value (or level) represents events and which represents censored observations.

An additional "Treat other values as" dropdown menu allows you to specify how Prism should handle any other value in the selected variable. Depending on the option selected, these rows may be treated as:

•Missing: With this option, Prism will treat any row with a value other than those specified for "Censored" and "Event" as if there were no value for this variable in this row at all. As a result, these rows will be omitted from the analysis. This is the default selection for this dropdown menu.

•Censored: With this option, rows with values other than those specified for "Censored" and "Event" will be treated as censored observations. This can be helpful if your data were entered (or coded) to account for multiple types of censoring ("loss due to lack of follow-up" or "alive at the end of the study", for example).

•Deaths/events: With this option, any rows with values other than those specified for "Censored" and "Event" will be treated as events. This choice should only be selected if you're interested in studying the probability of all events, and not the difference between events. For example, treating "death due to car accident" and "death due to heart attack" as the same event may be OK if you're studying general survival probability, but in a study examining the effects of an experimental treatment on heart failure, it's probably not a good idea to treat these two as equal (in this case "death due to car accident" would probably be treated as a censored observation).
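The effect of the three "Treat other values as" options can be illustrated with a small Python sketch. This is only a model of the behavior described above, not Prism's implementation; the function name classify_outcome, the option strings, and the example codes are hypothetical.

```python
from typing import Optional

def classify_outcome(value, event_value, censored_value,
                     treat_other_as: str = "missing") -> Optional[int]:
    """Return 1 for an event, 0 for censored, None for a row that is omitted."""
    if value == event_value:
        return 1
    if value == censored_value:
        return 0
    # Any other value is handled according to the selected option.
    if treat_other_as == "missing":
        return None  # row is dropped from the analysis
    if treat_other_as == "censored":
        return 0
    if treat_other_as == "event":
        return 1
    raise ValueError(f"unknown option: {treat_other_as!r}")

# Hypothetical outcome column with a third code ("Lost") besides event/censored.
outcomes = ["Died", "Censored", "Lost", "Died"]

as_missing = [classify_outcome(v, "Died", "Censored", "missing") for v in outcomes]
as_censored = [classify_outcome(v, "Died", "Censored", "censored") for v in outcomes]
as_event = [classify_outcome(v, "Died", "Censored", "event") for v in outcomes]
```

With "missing", the third row is excluded; with "censored" or "event", it is kept and counted accordingly.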

Choose ties estimation method

The Cox proportional hazards regression model requires that a time to event be recorded for every observation, and the order of these times is important to the calculation of this analysis. However, there are often situations in which multiple events are recorded with the same elapsed time (either due to the way the data were collected or because the specific ordering of the events was unknown). When event observations have the same elapsed times, these observations are said to be "tied", and there are different ways that the analysis can handle ties. By default, Prism will automatically select the method for handling ties: depending on the number of ties, Prism will use either the Exact method or Efron's method (see below for more information). Breslow's method is available only for purposes of matching results generated by other applications and is not generally recommended for use.

Additional information on ties methods

The exact method is generally considered to be the most accurate, and considers all possible permutations of ordering of the tied events when performing the required calculations. For example, if observations A, B, and C were tied, there are six different ways these could be permuted:

1.A, B, C

2.A, C, B

3.B, A, C

4.B, C, A

5.C, A, B

6.C, B, A
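The six orderings listed above can be enumerated directly, and the same count shows how quickly the number grows. A short Python sketch using only the standard library:

```python
from itertools import permutations
from math import factorial

tied = ["A", "B", "C"]

# Every possible ordering of the three tied events.
orderings = list(permutations(tied))
for order in orderings:
    print(", ".join(order))

# The number of orderings of n tied events is n factorial.
counts = {n: factorial(n) for n in (3, 5, 10)}
print(counts)
```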

As the number of ties in the data set increases, the total number of permutations rapidly increases, causing the computation time to increase dramatically. To address this, some approximations to the exact method were developed. The two most common approximation methods are:

1.Breslow's approximation

2.Efron's approximation

Breslow's approximation is the older of the two methods, and is less accurate than Efron's approximation, especially when there are a large number of ties in the data. Because of this, we do not recommend using Breslow's approximation unless specifically trying to match results generated by a different application. One way to compare the estimation methods is to fit the model with each method and compare the resulting AIC values. This value is reported in the model diagnostics section of the results.
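To make the difference between the two approximations concrete, the sketch below computes the contribution of a single tied event time to the log partial likelihood using the standard textbook forms of Breslow's and Efron's approximations. It is an illustration under stated assumptions, not Prism's internal code: the function names are hypothetical, and each "score" is assumed to be exp(beta * x) for one individual, with the risk set including the individuals who had the event.

```python
import math

def breslow_term(event_scores, risk_scores):
    """Breslow: every tied event uses the full risk-set sum as its denominator."""
    d = len(event_scores)
    return sum(math.log(r) for r in event_scores) - d * math.log(sum(risk_scores))

def efron_term(event_scores, risk_scores):
    """Efron: the k-th tied event removes a fraction k/d of the tied events' scores."""
    d = len(event_scores)
    total_risk = sum(risk_scores)
    total_event = sum(event_scores)
    log_denominator = sum(
        math.log(total_risk - (k / d) * total_event) for k in range(d)
    )
    return sum(math.log(r) for r in event_scores) - log_denominator

# Hypothetical scores: two tied events within a risk set of four individuals.
event_scores = [1.0, 2.0]
risk_scores = [1.0, 2.0, 1.5, 0.5]

breslow = breslow_term(event_scores, risk_scores)
efron = efron_term(event_scores, risk_scores)
```

When only one event occurs at a given time, the two functions return the same value; they differ only when events are tied.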

Define model

Like multiple linear regression and multiple logistic regression, Cox proportional hazards regression can accept both continuous and categorical variables as predictor variables in the model. However, one important difference between these other analyses and Cox proportional hazards regression is that there is no explicit intercept term for Cox proportional hazards regression. If an intercept term were to be included, it would simply be “absorbed” by the unspecified baseline hazard (h0(t)).
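A small numerical sketch can show why an intercept adds nothing to the model. It assumes the usual Cox form, hazard = h0(t) * exp(beta1*x1 + beta2*x2 + ...), and uses hypothetical coefficient and predictor values. Adding a constant to the linear predictor rescales every individual's hazard by the same factor, which is indistinguishable from a change in the baseline hazard, so hazard ratios are unchanged.

```python
import math

def hazard(baseline, betas, x, intercept=0.0):
    """Hazard at a fixed time: baseline hazard times exp(linear predictor)."""
    linear_predictor = intercept + sum(b * xi for b, xi in zip(betas, x))
    return baseline * math.exp(linear_predictor)

betas = [0.5, -0.2]      # hypothetical regression coefficients
treated = [1.0, 60.0]    # hypothetical predictor values for one individual
control = [0.0, 60.0]    # same covariate, different treatment group
h0 = 0.01                # hypothetical baseline hazard at some time t

ratio_without = hazard(h0, betas, treated) / hazard(h0, betas, control)
ratio_with = (hazard(h0, betas, treated, intercept=3.0)
              / hazard(h0, betas, control, intercept=3.0))

# The same hazards arise from folding the intercept into the baseline hazard.
absorbed = hazard(h0 * math.exp(3.0), betas, treated)
explicit = hazard(h0, betas, treated, intercept=3.0)
```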

Main effects

Main effects may be variables that you're investigating (such as treatment group or genotype) or they may be variables that you're simply correcting for (covariates such as age, sex, weight, etc.). Although the interpretation of these variables may be different, there's no distinction from a model definition point of view. Use the checkboxes to specify which predictor variables to include in the model. When fitting the model, Prism will estimate a regression coefficient (beta coefficient) for each of the selected main effects in the model. When a categorical predictor variable is included, the number of regression coefficients estimated for this predictor will be equal to the number of levels of the categorical variable minus one (e.g. a categorical predictor variable with four levels will generate three estimated regression coefficients). A regression coefficient will also be estimated for each interaction and transformation included in the model (see below).
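The "number of levels minus one" rule follows from the way a categorical predictor is typically expanded into indicator columns. The sketch below assumes reference-level (dummy) coding, in which one level serves as the reference and receives no column; the function name dummy_code and the example data are hypothetical, and this is not a description of Prism's internal coding.

```python
def dummy_code(values, reference):
    """Expand a categorical column into 0/1 indicator columns, omitting the reference level."""
    levels = sorted(set(values) - {reference})
    return {
        level: [1 if v == level else 0 for v in values]
        for level in levels
    }

# Hypothetical categorical predictor with four levels.
genotype = ["WT", "A", "B", "C", "A", "WT"]
columns = dummy_code(genotype, reference="WT")

# Four levels yield three indicator columns, hence three regression coefficients.
print(len(columns))
```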

Interactions

Prism makes it very easy to include any number of two- or three-way interactions of independent predictor variables in the model. Simply expand the list of interactions (two-way or three-way) that you would like to include, select the first predictor variable of the interaction you would like to add, and then select which variable(s) to include in this interaction. To select all possible interactions with a given predictor variable, simply select the main checkbox beside this predictor variable in the two-way or three-way interactions section of the Define model window.
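Numerically, an interaction term is usually represented as the product of the predictor columns involved, and each such term receives its own regression coefficient. The following sketch lists the possible two-way and three-way interactions for a set of predictors and builds one product column. The predictor names and values are hypothetical, and the example assumes numeric predictors (a categorical predictor would first be expanded into indicator columns).

```python
import math
from itertools import combinations

predictors = ["age", "dose", "weight"]

# All candidate interactions among the selected predictors.
two_way = list(combinations(predictors, 2))
three_way = list(combinations(predictors, 3))

def interaction_column(*columns):
    """Row-wise product of the given predictor columns."""
    return [math.prod(values) for values in zip(*columns)]

age = [50, 60]
dose = [1.0, 2.0]
age_by_dose = interaction_column(age, dose)
```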

Transforms

In addition to interactions, Prism also allows you to include the square, cube, square root, logarithm, or exponential of any predictor variable in the model. Simply expand the section of the transformation of interest, and then specify which predictor variables you’d like to have transformed as part of the analysis model using the checkboxes.
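Each transformation simply adds a new column derived from an existing predictor, and that column is then treated like any other term in the model. A minimal Python sketch of the five transformations named above follows. The dictionary TRANSFORMS and the example values are hypothetical, and the natural logarithm is an assumption, since the base of the logarithm is not stated here.

```python
import math

TRANSFORMS = {
    "square": lambda x: x ** 2,
    "cube": lambda x: x ** 3,
    "square root": math.sqrt,
    "logarithm": math.log,   # assumption: natural log; the base is not specified
    "exponential": math.exp,
}

def transformed_column(values, name):
    """Apply the named transformation to every value of a predictor column."""
    return [TRANSFORMS[name](v) for v in values]

weight = [4.0, 9.0, 16.0]
weight_squared = transformed_column(weight, "square")
weight_sqrt = transformed_column(weight, "square root")
```

Note that the square root and logarithm are only defined for non-negative and positive values, respectively, so these transformations are only meaningful for predictors in that range.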
