Common Survival analysis of Oncology trials in R

Time-to-event endpoints are widely used in oncology trials, such as OS and PFS. And survival analysis is a common method for estimating time-to-event endpoints. In this blog, I’d like to make a note of how to summarize the essential results for survival analysis in oncology trials in R and also compare them with SAS.

Normally we will show the primary analyses, like below: - Descriptive statistics of the number of events and censors. - Median (and 25th, 75th percentile) survival time from Kaplan-Meier estimate, along with 95% CI that will be calculated via Brookmeyer and Crowley methodology using log-log transformation. - Survival rate at each time-point of interest from Kaplan-Meier estimate, along with 95% CI that will be calculated via Greenwood formula using log-log transformation. - Hazard Ratio or stratified Hazard Ratio, along with 95% CI from Cox proportional hazards (PH) model, adding Efron approximation for ties handling. - P-value of log-rank test or stratified log-rank test.

Above are the most prevalent survival analysis methods in the Statistical Analysis Plan (SAP) for oncology trials. Let's see how to implement them in R.

In this blog, I will use the example data from the Worcester Heart Attack Study (https://stats.idre.ucla.edu/sas/seminars/sas-survival/) with 500 subjects, which has been wrapped in stabiot R package. And you can find the description of all columns in ?whas500 after installing the package, like devtools::install_github("kaigu1990/stabiot").

library(survival)
library(stabiot)

data("whas500")

If we want to compare the survival time between the subjects with and without atrial fibrillation, we should first convert the AFB variable to a factor.

dat <- whas500 %>%
  mutate(
    AFB = factor(AFB, levels = c(1, 0))
  )

Kaplan-Meier estimate

Afterwards we can compute the Kaplan-Meier estimate of the survival function for the whas500 dataset.

fit_km <- survfit(Surv(LENFOL, FSTAT) ~ AFB, data = dat, conf.type = "log-log")

In the Surv() function, the event variable takes on the value 1 for events and 0 for censoring, which is in contrast to SAS. And the conf.type = "log-log" tells the function to estimate the CI of median or other percentiles via Brookmeyer and Crowley methodology using log-log transformation because the default argument is conf.type = "log".

Then we can use summary() to see more detail or obtain the median survival time.

print(summary(fit_km), digits = 4)

# median survival time with CI
summary(fit_km)$table

##       records n.max n.start events    rmean se(rmean)   median  0.95LCL  0.95UCL
## AFB=1      78    78      78     47 35.86989  3.821604 28.41889 13.76591 45.24025
## AFB=0     422   422     422    168 48.63073  1.714196 70.96509 51.77823       NA

Or use quantile() for any quantile estimate.

# 25% 50% and 75% survival time and CI
quantile(fit_km, probs = c(0.25, 0.5, 0.75)) 

## $quantile
##             25       50       75
## AFB=1  3.12115 28.41889 77.20739
## AFB=0 11.33470 70.96509 77.30595
## 
## $lower
##              25       50       75
## AFB=1 0.5585216 13.76591 50.85832
## AFB=0 6.1437372 51.77823 77.30595

## $upper
##             25       50 75
## AFB=1 10.77618 45.24025 NA
## AFB=0 17.41273       NA NA

If you want to know the survival rate at specific time points like 12, 24 and 36 months, use times = c(12, 36, 60) in the summary() function.

summary(fit_km, times = c(12, 36, 60))

## Call: survfit(formula = Surv(LENFOL, FSTAT) ~ AFB, data = dat, conf.type = "log-log")
## 
##                 AFB=1 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##    12     50      28    0.641  0.0543        0.524        0.736
##    36     27      12    0.455  0.0599        0.335        0.567
##    60     11       6    0.315  0.0643        0.195        0.441
## 
##                 AFB=0 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##    12    312     110    0.739  0.0214        0.695        0.779
##    36    199      32    0.645  0.0244        0.595        0.690
##    60     77      21    0.530  0.0311        0.467        0.589

The n.risk column gives us the number of subjects who are still in the risk condition at specific time points. The n.event column demonstrates the number of events that occurred at the time. And survival column tells us the survival rate from the KM estimate and the last two columns are the corresponding CI.

In addition, you may be interested in the difference rate and corresponding CI between groups with and without AFB. Now that you know the rate and SE for two groups seperately, thus the difference rate and difference SE can be simply calculated. Afterwards for CI calculation, utilize the qnorm() function as follows.

diff_rate <- diff(rate_tb$surv)
diff_se <- sqrt(sum(rate_tb$std.err^2))
diff_rate + c(-1, 1) * qnorm(1 - 0.05 / 2) * diff_se
## [1] -0.01608841  0.21271012

Log-rank test

The Log-rank test is a non-parametric test for comparing the survival function across two or more groups where the null hypothesis is that the groups's survival functions are the same. It can be calculated via survminer::surv_pvalue() function with method = "log-rank" for survfit object, or survival::survdiff() function with rho = 0. Both are part of the default set, so you don't need to define them explicitly. Let me show them separately, as shown below.

survminer::surv_pvalue(fit_km, method = "log-rank")
##   variable         pval   method    pval.txt
## 1      AFB 0.0009616214 Log-rank p = 0.00096

survival::survdiff(Surv(LENFOL, FSTAT) ~ AFB, data = dat, rho = 0)$pvalue
## [1] 0.0009616214

If you would like to know what is the Log rank test, this article (Log Rank Test) can be for your reference.

Cox PH model

As we will know, the Cox regression model is a semi-parametric model since it makes no assumption about the distribution of the event times that is similar to the KM method of non-parametric, but it relies on a partial likelihood estimation that is partially defined parametrically. Before fitting the Cox model, we should make sure the proportional hazard assumption is met. And more details can be seen at http://www.sthda.com/english/wiki/cox-model-assumptions.

Let's go on the whas500 example data. If you want to estimate the hazard ratio comparing those two groups and also specify the efron approximation for tie handling, the survival::coxph() function can be used for fitting Cox PH models simply.

fit_cox <- coxph(Surv(LENFOL, FSTAT) ~ AFB, data = dat, ties = "efron")

## Call:
## coxph(formula = Surv(LENFOL, FSTAT) ~ AFB, data = dat, ties = "efron")
## 
##   n= 500, number of events= 215 
## 
##         coef exp(coef) se(coef)      z Pr(>|z|)   
## AFB0 -0.5397    0.5829   0.1654 -3.263   0.0011 **
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
## 
##      exp(coef) exp(-coef) lower .95 upper .95
## AFB0    0.5829      1.716    0.4215    0.8061
## 
## Concordance= 0.537  (se = 0.014 )
## Likelihood ratio test= 9.58  on 1 df,   p=0.002
## Wald test            = 10.64  on 1 df,   p=0.001
## Score (logrank) test = 10.9  on 1 df,   p=0.001

The Cox results can be interpreted as follows: - The coef is the coefficient, and z is the Wald statistic value that corresponds to the rate of coef to its standard error (se(coef)). - Hazard ratio (HR) corresponds to the exp(coef), which is comparing the current level to the reference level. So the HR of 0.58 indicates that the subjects without AFB have 0.58 less hazard or risk compared to those who have AFB. In other words, if the event occurred in 20% of the no AFB group, it would occur in 8.4% (20% - (20% x 0.58)) of the AFB group, which means the no AFB can reduce the hazard of deaths by 0.58. - The HR confidence interval is also provided, with lower 95% bound of 0.4215 and upper 95% bound of 0.8061. - There are three alternative tests for overall significance of the model: likelihood-ratio test, Wald test, and score log-rank statistics. And as we know, log-rank test is a special case of Cox model, which means it equals a univariate Cox regression (only considering treatment). So we can calculate the log-rank p-value from 'survdiff()` as well for the Cox model.

Stratified Log-rank test and Cox PH model

The stratified log-rank test is commonly used for randomized clinical trials when there are baseline factors that may be related to the treatment effect.

The stratified log-rank test is the log-rank test that accounts for the difference in prognostic factors between the two groups. Specifically, we divide the data according to the levels of the significant prognostic factors and form a stratum for each level. At each level, we arrange the survival times in ascending order and calculate the observed number of events, expected number of events, and variance at each survival time as we would in the regular log-rank test. (Chapter 23 - An Introduction to Survival Analysis)

To implement the stratified log-rank test, simply include the strata() within the survival model formula as follows.

strat_km <- survfit(Surv(LENFOL, FSTAT) ~ AFB + strata(AGE, GENDER), data = dat, conf.type = "log-log")
survminer::surv_pvalue(strat_km, method = "log-rank")

##                  variable       pval   method  pval.txt
## 1 AFB+strata(AGE, GENDER) 0.08269744 Log-rank p = 0.083

Regarding the stratified Cox model, it can be used if there are one or more predictors that don’t satisfy the proportional hazard assumptions. In other words, the proportional hazard is violated.

It can also be performed using coxph along with strata() function, the same as stratified log-rank test.

strat_cox <- coxph(Surv(LENFOL, FSTAT) ~ AFB + strata(AGE, GENDER), data = dat, ties = "efron")
summary(strat_cox)
strat_cox %>% 
  broom::tidy(exponentiate = TRUE, conf.int = TRUE, conf.level = 0.95) %>%
  select(term, estimate, conf.low, conf.high)

## # A tibble: 1 × 4
##   term  estimate conf.low conf.high
##   <chr>    <dbl>    <dbl>     <dbl>
## 1 AFB0     0.695    0.462      1.05

Above is a summary of common survival analyses in R. In the next step, I would like to wrap these functions into one or two functions and specify the print method so that we can simply use them to compare with SAS.

Reference

https://stats.stackexchange.com/questions/486806/the-logrank-test-statistic-is-equivalent-to-the-score-of-a-cox-regression-is-th
https://discourse.datamethods.org/t/when-is-log-rank-preferred-over-univariable-cox-regression/2344
Introduction to Regression Methods for Public Health Using R
Survival Analysis in R Companion
Survival Analysis Using R
Survival analysis in clinical trials — Log-rank test
Survival analysis in clinical trials — Kaplan-Meier estimator