+/-

Survival Analysis typically focuses on time to event data. In the most general sense, it consists of techniques for positive-valued random variables, such as

library(ggplot2)
library(survival)
library(KMsurv)

Definitions and notation

Failure time random variables are always non-negative. That is, if we denote the failure time by $T$, then $T\geq 0$.

A random variable X is called a censored failure time random variable if $X = \min(T,U)$, where $U$ is a non-negative censoring variable.

The illustration of survival data shows several features which are typically encountered in analysis of survival data:

The first feature is referred to as 'staggered entry'. The last two features relate to 'censoring' of the failure time events.

Types of censoring

Right-censoring

We call this right-censoring because the true unobserved event is to the right of our censoring time; i.e., all we know is that the event has not happened at the end of follow-up.

In addition to observing $X_i$, we also get to see the failure indicator: \[ \delta_i=\left\{\begin{array}{c} 1 \mbox{ if } T_i\leq U_i\\ 0 \mbox{ if } T_i> U_i \end{array}\right. \]

Some software packages instead assume we have a censoring indicator: \[ c_i=\left\{\begin{array}{c} 0 \mbox{ if } T_i\leq U_i\\ 1 \mbox{ if } T_i> U_i \end{array}\right. \]

Right-censoring is the most common type of censoring assumption we will deal with in survival analysis.

Left-censoring

Can only observe $Y_i=\max(T_i, U_i)$ and the failure indicators: \[ \delta_i=\left\{\begin{array}{c} 1 \mbox{ if } U_i\leq T_i\\ 0 \mbox{ if } U_i> T_i \end{array}\right. \] e.g. (Miller) study of age at which African children learn a task. Some already knew (left-censored), some learned during study (exact), some had not yet learned by end of study (right-censored).

Interval censoring

More definitions and notation

There are several equivalent ways to characterize the probability distribution of a survival random variable. Some of these are familiar; others are special to survival analysis. We will focus on the following terms:

Density function

Survivorship function

In other settings, the cumulative distribution function, $F(t)=\mathsf{P}[T\leq t]$, is of interest. In survival analysis, our interest tends to focus on the survival function, $S(t)$.

Hazard function

Sometimes called an instantaneous failure rate, the force of mortality, or the age-specific failure rate.

Cumulative hazard function

Several relationships

$S(t)$ and $\lambda(t)$ for a continuous r. v. \[ \lambda(t)=-\frac{\mbox{d}}{\mbox{d}t}[\log S(t)] \]

$S(t)$ and $\Lambda(t)$ for a continuous r. v. \[ S(t)=\exp\{-\Lambda(t)\} \]

$S(t)$ and $\Lambda(t)$ for a discrete r. v. (suppose that $a_j < t \leq a_{j+1}$) \[ S(t)=\prod_{j:\,a_j < t}(1-\lambda_j) \] Cox defines $\Lambda(t)=\sum_{j:\,a_j < t}\log (1-\lambda_j)$ so that $S(t)=\exp\{-\Lambda(t)\}$ in the discrete case, as well.

Measuring central tendency in survival

Median survival - call this $\tau$, is defined by \[ S(\tau)=0.5 \] Similarly, any other percentile could be defined.

In practice, we don't usually hit the median survival at exactly one of the failure times. In this case, the estimated median survival is the smallest time $\tau$ such that $S(\tau)\leq 0.5$

Estimating the survival or hazard function

If no censoring

The empirical estimate of the survival function, $\widehat{S}(t)$, is the proportion of individuals with event times greater than $t$.

With censoring

If there are censored observations, then $\widehat{S}(t)$ is not a good estimate of the true $S(t)$, so other non-parametric methods must be used to account for censoring (life-table methods, Kaplan-Meier estimator)

Preview of coming attractions

Next we will discuss the most famous non-parametric approach for estimating the survival distribution, called the Kaplan-Meier estimator.

To motivate the derivation of this estimator, we will first consider a set of survival times where there is no censoring.

The following are times to relapse (weeks) for 21 leukemia patients receiving control treatment (Table 1.1 of Cox & Oakes):

How would we estimate $S(10)$, the probability that an individual survives to time 10 or later?

Empirical survival function

When there is no censoring, the general formula is: \[ \widehat{S}(t)=\frac{\# \mbox{ of individuals with } T\geq t}{\mbox{total sample size}} \] In most software packages, the survival function is evaluated just after time $t$, i.e., at $t^+$. In this case, we only count the individuals with $T > t$.

Kaplan-Meier estimator

We know $\widehat{S}(6)=21/21$, because everyone survived at least until time 6 or greater. But, we can't say $\widehat{S}(7)=17/21$, because we don't know the status of the person who was censored at time 6.

In a 1958 paper in the Journal of the American Statistical Association, Kaplan and Meier proposed a way to nonparametrically estimate $S(t)$, even in the presence of censoring. The method is based on the ideas of conditional probability.

Suppose $a_k < t \leq a_{k+1}$. Then \[ S(t)=\mathsf{P}[T\geq a_{k+1}]=\mathsf{P}[T\geq a_1,\ldots,T\geq a_{k+1}]=\prod_{j=1}^{k}\{1-\mathsf{P}[T=a_j|T\geq a_j]\}=\prod_{j=1}^{k}\{1-\lambda_j\} \] So \[ \widehat{S}(t)=\prod_j^{ a_j < t }\left(1-\frac{r_j}{d_j}\right) \] where $d_j$ is the number of deaths at $a_j$ and $r_j$ is the number at risk at $a_j$.

Intuition behind the Kaplan-Meier estimator

Think of dividing the observed timespan of the study into a series of fine intervals so that there is a separate interval for each time of death or censoring:

Using the law of conditional probability, \[ \mathsf{P}[T\geq t]=\prod_j\mathsf{P}[\mbox{survive $j$ th interval } I_j | \mbox{ survived to start of } I_j] \] where the product is taken over all the intervals including or preceding time $t$.

General Formula for jth interval

It turns out we can write a general formula for the conditional probability of surviving the j-th interval that holds for all 4 cases.

We could use the same approach by grouping the event times into intervals (say, one interval for each month), and then counting up the number of deaths (events) in each to estimate the probability of surviving the interval (this is called the lifetable estimate).

However, the assumption that those censored last until the end of the interval wouldn't be quite accurate, so we would end up with a cruder approximation.

As the intervals get finer and finer, the approximations made in estimating the probabilities of getting through each interval become smaller and smaller, so that the estimator converges to the true $S(t)$.

This intuition clarifies why an alternative name for the KM is the product limit estimator.

KM estimator

Leukemia data set

The file leukAB.dat contains the failure/censoring times and the event indicators for the Leukemia example (treatment arm).

# read dataset, skip 2 lines of file:
leuk = read.table("http://www.karlin.mff.cuni.cz/~pesta/prednasky/NMFM404/Data/leukAB.dat",
  sep=" ", head=T, skip=4)
# print a few observations; t = time, f = event indicator
leuk[1:3,]

##   t f
## 1 6 0
## 2 6 1
## 3 6 1

fit.leuk = survfit(Surv(leuk$t, leuk$f) ~ 1)
summary(fit.leuk)

## Call: survfit(formula = Surv(leuk$t, leuk$f) ~ 1)
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##     6     21       3    0.857  0.0764        0.720        1.000
##     7     17       1    0.807  0.0869        0.653        0.996
##    10     15       1    0.753  0.0963        0.586        0.968
##    13     12       1    0.690  0.1068        0.510        0.935
##    16     11       1    0.627  0.1141        0.439        0.896
##    22      7       1    0.538  0.1282        0.337        0.858
##    23      6       1    0.448  0.1346        0.249        0.807

plot(fit.leuk, xlab="Weeks", ylab="Proportion Survivors", col=3)

Confidence intervals

Greenwood's formula

Variance of KM: \[ \mathsf{var}\widehat{S}(t)=\left[\widehat{S}(t)\right]^2\sum_j^{ \tau_j < t }\frac{d_j}{r_j(d_j-r_j)} \] For a 95% confidence interval, we could use \[ \widehat{S}(t)\pm z_{1-\alpha/2}\mathsf{se}\left(\widehat{S}(t)\right) \] where $\mathsf{se}\left(\widehat{S}(t)\right)$ is calculated using Greenwood's formula.

Log-log approach

Better approach: Get a 95% confidence interval for $L(t) = \log(- \log(S(t)))$, i.e., log-log approach. Since this quantity is unrestricted, the confidence interval will be in the proper range when we transform back.

Applying the delta method, we get: \[ \mathsf{var}\widehat{L}(t)=\frac{1}{\left[\log \widehat{S}(t)\right]^2}\sum_j^{ \tau_j < t }\frac{d_j}{r_j(d_j-r_j)} \] We take the square root of the above to get $\mathsf{se}\left(\widehat{L}(t)\right)$, and then form the confidence intervals as: \[ \widehat{S}(t)^{\exp\left\{\pm 1.96\mathsf{se}\left(\widehat{L}(t)\right)\right\}} \] R gives an option to calculate these bounds (use conf.type='log-log' in survfit).

Log stabilizing transformation

R (by default) uses a log transformation to stabilize the variance and allow for non-symmetric confidence intervals. This is what is normally done for the confidence interval of an estimated odds ratio. \[ \mathsf{var}\left(\log \widehat{L}(t)\right)=\sum_j^{ \tau_j < t }\frac{d_j}{r_j(d_j-r_j)} \]

summary(fit.leuk)     # CI = "log"

## Call: survfit(formula = Surv(leuk$t, leuk$f) ~ 1)
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##     6     21       3    0.857  0.0764        0.720        1.000
##     7     17       1    0.807  0.0869        0.653        0.996
##    10     15       1    0.753  0.0963        0.586        0.968
##    13     12       1    0.690  0.1068        0.510        0.935
##    16     11       1    0.627  0.1141        0.439        0.896
##    22      7       1    0.538  0.1282        0.337        0.858
##    23      6       1    0.448  0.1346        0.249        0.807

fit.leuk2 = survfit(Surv(leuk$t, leuk$f) ~ 1, conf.type="log-log")
summary(fit.leuk2)

## Call: survfit(formula = Surv(leuk$t, leuk$f) ~ 1, conf.type = "log-log")
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##     6     21       3    0.857  0.0764        0.620        0.952
##     7     17       1    0.807  0.0869        0.563        0.923
##    10     15       1    0.753  0.0963        0.503        0.889
##    13     12       1    0.690  0.1068        0.432        0.849
##    16     11       1    0.627  0.1141        0.368        0.805
##    22      7       1    0.538  0.1282        0.268        0.747
##    23      6       1    0.448  0.1346        0.188        0.680

fit.leuk3 = survfit(Surv(leuk$t, leuk$f) ~ 1, conf.type="plain")
summary(fit.leuk3)        # notice CI symmetrical about S_hat

## Call: survfit(formula = Surv(leuk$t, leuk$f) ~ 1, conf.type = "plain")
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##     6     21       3    0.857  0.0764        0.707        1.000
##     7     17       1    0.807  0.0869        0.636        0.977
##    10     15       1    0.753  0.0963        0.564        0.942
##    13     12       1    0.690  0.1068        0.481        0.900
##    16     11       1    0.627  0.1141        0.404        0.851
##    22      7       1    0.538  0.1282        0.286        0.789
##    23      6       1    0.448  0.1346        0.184        0.712

ggplot2 for KM

ggsurv <- function(s, CI = 'def', plot.cens = T, surv.col = 'gg.def',
                   cens.col = 'red', lty.est = 1, lty.ci = 2,
                   cens.shape = 3, back.white = F, xlab = 'Time',
                   ylab = 'Survival', main = ''){
 
  library(ggplot2)  
  strata <- ifelse(is.null(s$strata) ==T, 1, length(s$strata))
  stopifnot(length(surv.col) == 1 | length(surv.col) == strata)
  stopifnot(length(lty.est) == 1 | length(lty.est) == strata)
 
  ggsurv.s <- function(s, CI = 'def', plot.cens = T, surv.col = 'gg.def',
                       cens.col = 'red', lty.est = 1, lty.ci = 2,
                       cens.shape = 3, back.white = F, xlab = 'Time',
                       ylab = 'Survival', main = ''){
 
    dat <- data.frame(time = c(0, s$time),
                      surv = c(1, s$surv),
                      up = c(1, s$upper),
                      low = c(1, s$lower),
                      cens = c(0, s$n.censor))
    dat.cens <- subset(dat, cens != 0)
 
    col <- ifelse(surv.col == 'gg.def', 'black', surv.col)
 
    pl <- ggplot(dat, aes(x = time, y = surv)) + 
      xlab(xlab) + ylab(ylab) + ggtitle(main) + 
      geom_step(col = col, lty = lty.est)
 
    pl <- if(CI == T | CI == 'def') {
      pl + geom_step(aes(y = up), color = col, lty = lty.ci) +
        geom_step(aes(y = low), color = col, lty = lty.ci)
    } else (pl)
 
    pl <- if(plot.cens == T & length(dat.cens) > 0){
      pl + geom_point(data = dat.cens, aes(y = surv), shape = cens.shape,
                       col = cens.col)
    } else if (plot.cens == T & length(dat.cens) == 0){
      stop ('There are no censored observations') 
    } else(pl)
 
    pl <- if(back.white == T) {pl + theme_bw()
    } else (pl)
    pl
  }
 
  ggsurv.m <- function(s, CI = 'def', plot.cens = T, surv.col = 'gg.def',
                       cens.col = 'red', lty.est = 1, lty.ci = 2,
                       cens.shape = 3, back.white = F, xlab = 'Time',
                       ylab = 'Survival', main = '') {
    n <- s$strata
 
    groups <- factor(unlist(strsplit(names
                                     (s$strata), '='))[seq(2, 2*strata, by = 2)])
    gr.name <-  unlist(strsplit(names(s$strata), '='))[1]
    gr.df <- vector('list', strata)
    ind <- vector('list', strata)
    n.ind <- c(0,n); n.ind <- cumsum(n.ind)
    for(i in 1:strata) ind[[i]] <- (n.ind[i]+1):n.ind[i+1]
 
    for(i in 1:strata){
      gr.df[[i]] <- data.frame(
        time = c(0, s$time[ ind[[i]] ]),
        surv = c(1, s$surv[ ind[[i]] ]),
        up = c(1, s$upper[ ind[[i]] ]), 
        low = c(1, s$lower[ ind[[i]] ]),
        cens = c(0, s$n.censor[ ind[[i]] ]),
        group = rep(groups[i], n[i] + 1)) 
    }
 
    dat <- do.call(rbind, gr.df)
    dat.cens <- subset(dat, cens != 0)
 
    pl <- ggplot(dat, aes(x = time, y = surv, group = group)) + 
      xlab(xlab) + ylab(ylab) + ggtitle(main) + 
      geom_step(aes(col = group, lty = group))
 
    col <- if(length(surv.col == 1)){
      scale_colour_manual(name = gr.name, values = rep(surv.col, strata))
    } else{
      scale_colour_manual(name = gr.name, values = surv.col)
    }
 
    pl <- if(surv.col[1] != 'gg.def'){
      pl + col
    } else {pl + scale_colour_discrete(name = gr.name)}
 
    line <- if(length(lty.est) == 1){
      scale_linetype_manual(name = gr.name, values = rep(lty.est, strata))
    } else {scale_linetype_manual(name = gr.name, values = lty.est)}
 
    pl <- pl + line
 
    pl <- if(CI == T) {
      if(length(surv.col) > 1 && length(lty.est) > 1){
        stop('Either surv.col or lty.est should be of length 1 in order
             to plot 95% CI with multiple strata')
      }else if((length(surv.col) > 1 | surv.col == 'gg.def')[1]){
        pl + geom_step(aes(y = up, color = group), lty = lty.ci) +
          geom_step(aes(y = low, color = group), lty = lty.ci)
      } else{pl +  geom_step(aes(y = up, lty = group), col = surv.col) +
               geom_step(aes(y = low,lty = group), col = surv.col)}   
    } else {pl}
 
 
    pl <- if(plot.cens == T & length(dat.cens) > 0){
      pl + geom_point(data = dat.cens, aes(y = surv), shape = cens.shape,
                      col = cens.col)
    } else if (plot.cens == T & length(dat.cens) == 0){
      stop ('There are no censored observations') 
    } else(pl)
 
    pl <- if(back.white == T) {pl + theme_bw()
    } else (pl) 
    pl
  } 
  pl <- if(strata == 1) {ggsurv.s(s, CI , plot.cens, surv.col ,
                                  cens.col, lty.est, lty.ci,
                                  cens.shape, back.white, xlab,
                                  ylab, main) 
  } else {ggsurv.m(s, CI, plot.cens, surv.col ,
                   cens.col, lty.est, lty.ci,
                   cens.shape, back.white, xlab,
                   ylab, main)}
  pl
}

Lifetable or actuarial estimator

Our goal is still to estimate the survival function, hazard, and density function, but this is complicated by the fact that we don't know exactly when during each time interval an event occurs.

We could apply the K-M formula directly. However, this approach is unsatisfactory for grouped data ... it treats the problem as though it were in discrete time, with events happening only at 1 yr, 2 yr, etc. In fact, what we are trying to calculate here is the conditional probability of dying within the interval, given survival to the beginning of it.

We can assume that censorings occur on average halfway through the interval: \[ r_j'=r_j-c_j/2 \] The assumption yields the Actuarial Estimator. It is appropriate if censorings occur uniformly throughout the interval.

Quantities estimated

Constructing the lifetable

Incidentally, KM in KMsurv stands for Klein and Moeschberger, not Kaplan-Meier!

Angina Pectoris data

intEndpts = 0:16    # interval endpoints
deaths = c(456, 226, 152, 171, 135, 125, 83, 74, 51, 42, 43, 34, 18, 9, 6, 0)
cens = c(0, 39, 22, 23, 24, 107, 133, 102, 68, 64, 45, 53, 33, 27, 23, 30) # n censored
fitlt = lifetab(tis = intEndpts, ninit=2418, nlost=cens, nevent=deaths)
round(fitlt, 4) # restrict output to 4 decimals

##       nsubs nlost  nrisk nevent   surv    pdf hazard se.surv se.pdf
## 0-1    2418     0 2418.0    456 1.0000 0.1886 0.2082  0.0000 0.0080
## 1-2    1962    39 1942.5    226 0.8114 0.0944 0.1235  0.0080 0.0060
## 2-3    1697    22 1686.0    152 0.7170 0.0646 0.0944  0.0092 0.0051
## 3-4    1523    23 1511.5    171 0.6524 0.0738 0.1199  0.0097 0.0054
## 4-5    1329    24 1317.0    135 0.5786 0.0593 0.1080  0.0101 0.0049
## 5-6    1170   107 1116.5    125 0.5193 0.0581 0.1186  0.0103 0.0050
## 6-7     938   133  871.5     83 0.4611 0.0439 0.1000  0.0104 0.0047
## 7-8     722   102  671.0     74 0.4172 0.0460 0.1167  0.0105 0.0052
## 8-9     546    68  512.0     51 0.3712 0.0370 0.1048  0.0106 0.0050
## 9-10    427    64  395.0     42 0.3342 0.0355 0.1123  0.0107 0.0053
## 10-11   321    45  298.5     43 0.2987 0.0430 0.1552  0.0109 0.0063
## 11-12   233    53  206.5     34 0.2557 0.0421 0.1794  0.0111 0.0068
## 12-13   146    33  129.5     18 0.2136 0.0297 0.1494  0.0114 0.0067
## 13-14    95    27   81.5      9 0.1839 0.0203 0.1169  0.0118 0.0065
## 14-15    59    23   47.5      6 0.1636 0.0207 0.1348  0.0123 0.0080
## 15-16    30    30   15.0      0 0.1429     NA     NA  0.0133     NA
##       se.hazard
## 0-1      0.0097
## 1-2      0.0082
## 2-3      0.0076
## 3-4      0.0092
## 4-5      0.0093
## 5-6      0.0106
## 6-7      0.0110
## 7-8      0.0135
## 8-9      0.0147
## 9-10     0.0173
## 10-11    0.0236
## 11-12    0.0306
## 12-13    0.0351
## 13-14    0.0389
## 14-15    0.0549
## 15-16        NA

Duration of nursing home stay data

The National Center for Health Services Research studied 36 for-profit nursing homes to assess the effects of different financial incentives on length of stay. 'Treated' nursing homes received higher per diems for Medicaid patients, and bonuses for improving a patient's health and sending them home.

Suppose we wish to use the actuarial method, but the data do not come grouped.

Consider the treated nursing home patients, with length of stay (los) grouped into 100 day intervals:

nurshome = read.table("http://www.karlin.mff.cuni.cz/~pesta/prednasky/NMFM404/Data/NursingHome.dat",
  head=F, skip=14, col.names = c("los", "age", "rx", "gender", "married", "health", "censor"))
nurshome$int = floor(nurshome$los/100)*100    # group in 100 day intervals
nurshome = nurshome[nurshome$rx==1,]   # keep only treated homes
nurshome[1:3,]    # check out a few observations

##    los age rx gender married health censor  int
## 1   37  86  1      0       0      2      0    0
## 2   61  77  1      0       0      4      0    0
## 3 1084  75  1      0       0      3      1 1000

tab = table(nurshome$int, nurshome$censor)
intEndpts = (0:11)*100
ntotal = dim(nurshome)[1]  # nr patients
cens = tab[,2]   # nr censored in each interval
released = tab[,1]   # nr released in each interval
fitlt = lifetab(tis = intEndpts, ninit=ntotal, nlost=cens, nevent=released)
round(fitlt, 4)        # restrict output to 4 decimals

##           nsubs nlost nrisk nevent   surv    pdf hazard se.surv se.pdf
## 0-100       712     0 712.0    330 1.0000 0.0046 0.0060  0.0000  2e-04
## 100-200     382     0 382.0     86 0.5365 0.0012 0.0025  0.0187  1e-04
## 200-300     296     0 296.0     65 0.4157 0.0009 0.0025  0.0185  1e-04
## 300-400     231     0 231.0     38 0.3244 0.0005 0.0018  0.0175  1e-04
## 400-500     193     1 192.5     32 0.2711 0.0005 0.0018  0.0167  1e-04
## 500-600     160     0 160.0     13 0.2260 0.0002 0.0008  0.0157  1e-04
## 600-700     147     0 147.0     13 0.2076 0.0002 0.0009  0.0152  1e-04
## 700-800     134    30 119.0     10 0.1893 0.0002 0.0009  0.0147  0e+00
## 800-900      94    29  79.5      4 0.1734 0.0001 0.0005  0.0143  0e+00
## 900-1000     61    30  46.0      4 0.1647 0.0001 0.0009  0.0142  1e-04
## 1000-1100    27    27  13.5      0 0.1503     NA     NA  0.0147     NA
##           se.hazard
## 0-100         3e-04
## 100-200       3e-04
## 200-300       3e-04
## 300-400       3e-04
## 400-500       3e-04
## 500-600       2e-04
## 600-700       3e-04
## 700-800       3e-04
## 800-900       3e-04
## 900-1000      5e-04
## 1000-1100        NA

names(fitlt)  # check out components of fitlt object

##  [1] "nsubs"     "nlost"     "nrisk"     "nevent"    "surv"     
##  [6] "pdf"       "hazard"    "se.surv"   "se.pdf"    "se.hazard"

x = rep(intEndpts, rep(2,12))[2:23]
y = rep(fitlt$surv, rep(2,11))

plot(x, y, type="l", col=4, xlab="Time Interval (Days)", ylab="Survival (Life Table)")
title(main = "Duration of stay in nursing homes", cex=.6)

plot of chunk survival-lifetable-nursing

y = rep(fitlt$hazard, rep(2,11))

plot(x, y, type="l", col=6, xlab="Time Interval (Days)", ylab="Hazard (Life Table)")
title(main = "Duration of stay in nursing homes", cex=.6)

plot of chunk survival-lifetable-nursing2

Nelson-Aalen estimator

$\Lambda(t)$ can then be approximated by a sum: \[ \widehat{\Lambda}(t)=\sum_j\lambda_j\Delta \] where the sum is over intervals, $\lambda_j$ is the value of the hazard in the j-th interval and $\Delta$ is the width of each interval. Since $\lambda_j\Delta$ is approximately the probability of dying in the interval, we can further approximate by \[ \widehat{\Lambda}(t)=\sum_j\frac{d_j}{r_j} \] It follows that $\Lambda(t)$ will change only at death times, and hence we write the Nelson-Aalen estimator as: \[ \widehat{\Lambda}_{NA}(t)=\sum_{j}^{ \tau_j < t }\frac{d_j}{r_j} \]

Once we have $\widehat{\Lambda}_{NA}(t)$, we can also find another estimator of $S(t)$ (Fleming-Harrington): \[ \widehat{S}_{FM}(t)=\exp\{-\widehat{\Lambda}_{NA}(t)\} \]

Say we want to obtain the Fleming-Harrington estimate of the survival function for married females, in the healthiest initial subgroup, who are randomized to the untreated group of the nursing home study.

nurshome = read.table("http://www.karlin.mff.cuni.cz/~pesta/prednasky/NMFM404/Data/NursingHome.dat",
  head=F, skip=14, col.names = c("los", "age", "rx", "gender", "married", "health", "censor"))
(nurs2 = subset(nurshome, gender==0 & rx==0 & health==2 & married==1))

##      los age rx gender married health censor
## 144   89  78  0      0       1      2      0
## 362  123  75  0      0       1      2      0
## 427   38  78  0      0       1      2      0
## 601   24  82  0      0       1      2      0
## 719  185  86  0      0       1      2      0
## 736  113  73  0      0       1      2      0
## 1120  25  71  0      0       1      2      0
## 1343  14  73  0      0       1      2      0
## 1362 149  81  0      0       1      2      0
## 1461 168  72  0      0       1      2      0
## 1472  64  81  0      0       1      2      0
## 1489 234  72  0      0       1      2      0

fit.fh = survfit(Surv(nurs2$los, 1-nurs2$censor) ~ 1,
  type="fleming-harrington", conf.type="log-log")
summary(fit.fh)

## Call: survfit(formula = Surv(nurs2$los, 1 - nurs2$censor) ~ 1, type = "fleming-harrington", 
##     conf.type = "log-log")
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##    14     12       1   0.9200  0.0801       0.5244        0.989
##    24     11       1   0.8401  0.1085       0.4750        0.960
##    25     10       1   0.7601  0.1267       0.4056        0.920
##    38      9       1   0.6802  0.1388       0.3368        0.872
##    64      8       1   0.6003  0.1465       0.2718        0.819
##    89      7       1   0.5204  0.1502       0.2116        0.760
##   113      6       1   0.4405  0.1505       0.1564        0.696
##   123      5       1   0.3606  0.1472       0.1070        0.628
##   149      4       1   0.2809  0.1404       0.0641        0.556
##   168      3       1   0.2012  0.1299       0.0293        0.483
##   185      2       1   0.1221  0.1169       0.0059        0.422
##   234      1       1   0.0449     Inf       0.0000        1.000

# compare with KM:
fit.km = survfit(Surv(nurs2$los, 1-nurs2$censor) ~ 1,
  type="kaplan-meier", conf.type="log-log")
summary(fit.km)

## Call: survfit(formula = Surv(nurs2$los, 1 - nurs2$censor) ~ 1, type = "kaplan-meier", 
##     conf.type = "log-log")
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##    14     12       1   0.9167  0.0798      0.53898        0.988
##    24     11       1   0.8333  0.1076      0.48171        0.956
##    25     10       1   0.7500  0.1250      0.40842        0.912
##    38      9       1   0.6667  0.1361      0.33702        0.860
##    64      8       1   0.5833  0.1423      0.27014        0.801
##    89      7       1   0.5000  0.1443      0.20848        0.736
##   113      6       1   0.4167  0.1423      0.15247        0.665
##   123      5       1   0.3333  0.1361      0.10270        0.588
##   149      4       1   0.2500  0.1250      0.06014        0.505
##   168      3       1   0.1667  0.1076      0.02651        0.413
##   185      2       1   0.0833  0.0798      0.00505        0.311
##   234      1       1   0.0000     NaN           NA           NA

Comparison of survival curves

Two survival curves

\[ \begin{align*} H_0&:\,S_1(t)=S_2(t),\,\forall t\\ H_1&:\,\exists t:\,S_1(t)\neq S_2(t) \end{align*} \]

Cochran-Mantel-Haenszel Logrank test

It also has an intuitive appeal, building on standard methods for binary data. (Later we will see that it can also be obtained as the score test from a partial likelihood from the Cox Proportional Hazards model.)

The logrank test is obtained by constructing a 2x2 table at each distinct death time, and comparing the death rates between the two groups, conditional on the number at risk in the groups. The tables are then combined using the Cochran-Mantel-Haenszel test. Note that the logrank is sometimes called the Cox-Mantel test.

leuk = read.table("http://www.karlin.mff.cuni.cz/~pesta/prednasky/NMFM404/Data/leuk.dat",
  sep=" ", head=T, skip=7)
summary(leuk)

##      weeks           remiss           trtmt    
##  Min.   : 1.00   Min.   :0.0000   Min.   :0.0  
##  1st Qu.: 6.00   1st Qu.:0.0000   1st Qu.:0.0  
##  Median :10.50   Median :1.0000   Median :0.5  
##  Mean   :12.88   Mean   :0.7143   Mean   :0.5  
##  3rd Qu.:18.50   3rd Qu.:1.0000   3rd Qu.:1.0  
##  Max.   :35.00   Max.   :1.0000   Max.   :1.0

## Log Rank test
(leuk.lrt = survdiff(Surv(weeks,remiss) ~ trtmt, data = leuk))

## Call:
## survdiff(formula = Surv(weeks, remiss) ~ trtmt, data = leuk)
## 
##          N Observed Expected (O-E)^2/E (O-E)^2/V
## trtmt=0 21       21     10.7      9.77      16.8
## trtmt=1 21        9     19.3      5.46      16.8
## 
##  Chisq= 16.8  on 1 degrees of freedom, p= 4.17e-05

Peto-Peto logrank test

The logrank test can be derived by assigning scores to the ranks of the death times. This is called the Peto-Peto logrank test, or the linear rank logrank test.

Comparing the CMH-type Logrank and Peto-Peto Logrank

Peto-Peto-Prentice-Gehan-Wilcoxon test

This is the Prentice (1978) modification of the Peto & Peto (1972) modification of the Gehan (1965) modification of the Wilcoxon (1945) rank test.

## Peto-Peto-Prentice-Gehan-Wilcoxon Rank test
## notice: rho=1
(leuk.pppgw = survdiff(Surv(weeks,remiss) ~ trtmt, data = leuk, rho=1))

## Call:
## survdiff(formula = Surv(weeks, remiss) ~ trtmt, data = leuk, 
##     rho = 1)
## 
##          N Observed Expected (O-E)^2/E (O-E)^2/V
## trtmt=0 21    14.55     7.68      6.16      14.5
## trtmt=1 21     5.12    12.00      3.94      14.5
## 
##  Chisq= 14.5  on 1 degrees of freedom, p= 0.000143

leuk.km = survfit(Surv(weeks, remiss) ~ trtmt, data=leuk)

plot(leuk.km, xlab = "Weeks", ylab = "Survival", col=c(2,3))
legend(18, .95, legend=c("No treatment", "Treatment"),
  col=c(2:3), lty=1)
title(main="Treatment Comparison, Leukemia Data", cex=.7)

(ggkm <- ggsurv(leuk.km))

Which test should we use?

Three or more survival curves - $P$-sample logrank

There are more than two groups, and the question of interest is whether the groups differ from each other.

Example: Time taken to finish a test with 3 different noise distractions. All tests were stopped after 12 minutes.

time = c(9, 9.5, 9, 8.5, 10, 10.5, 10, 12, 12, 11, 12,
  10.5, 12, 12, 12, 12, 12, 12)
event = c(1,1,1,1,1,1,1,1,0,1,1,1,1,0,0,0,0,0)
group = rep(1:3, times=c(6,6,6))
(cbind(group,time,event))

##       group time event
##  [1,]     1  9.0     1
##  [2,]     1  9.5     1
##  [3,]     1  9.0     1
##  [4,]     1  8.5     1
##  [5,]     1 10.0     1
##  [6,]     1 10.5     1
##  [7,]     2 10.0     1
##  [8,]     2 12.0     1
##  [9,]     2 12.0     0
## [10,]     2 11.0     1
## [11,]     2 12.0     1
## [12,]     2 10.5     1
## [13,]     3 12.0     1
## [14,]     3 12.0     0
## [15,]     3 12.0     0
## [16,]     3 12.0     0
## [17,]     3 12.0     0
## [18,]     3 12.0     0

(noise.lrt = survdiff(Surv(time, event) ~ group, rho=0))

## Call:
## survdiff(formula = Surv(time, event) ~ group, rho = 0)
## 
##         N Observed Expected (O-E)^2/E (O-E)^2/V
## group=1 6        6     1.57   12.4463   17.2379
## group=2 6        5     4.53    0.0488    0.0876
## group=3 6        1     5.90    4.0660    9.4495
## 
##  Chisq= 20.4  on 2 degrees of freedom, p= 3.75e-05

noise.km = survfit(Surv(time, event) ~ group)
plot(noise.km, col = 2:4, xlab = "Minutes",
  ylab = "Time to finish test",
  main = "Time to finish test for 3 noise levels")

Stratified Logrank

Sometimes, even though we are interested in comparing two groups (or maybe $P$) groups, we know there are other factors that also affect the outcome. It would be useful to adjust for these other factors in some way.

Example: For the nursing home data, a logrank test comparing length of stay for those under and over 85 years of age suggests a significant difference ($p=0.03$). However, we know that gender has a strong association with length of stay, and also age. Hence, it would be a good idea to stratify the analysis by gender when trying to assess the age effect.

A stratified logrank allows one to compare groups, but allows the shapes of the hazards of the different groups to differ across strata. It makes the assumption that the group 1 vs group 2 hazard ratio is constant across strata. In other words: $\lambda_{1s}(t)/\lambda_{2s}(t)=\theta$ where $\theta$ is constant over the strata ($s =1,\ldots,S$).

nurshome =
  read.table("http://www.karlin.mff.cuni.cz/~pesta/prednasky/NMFM404/Data/NursingHome.dat",
  head=F, skip=14, col.names = c("los", "age", "rx", "gender", "married", "health", "censor"))
nurshome$discharged = 1 - nurshome$censor # event indicator
nurshome$under85 = (nurshome$age < 85) + 0
(nurs.slrt = survdiff( Surv(los, discharged) ~ under85 + strata(gender), data = nurshome))

## Call:
## survdiff(formula = Surv(los, discharged) ~ under85 + strata(gender), 
##     data = nurshome)
## 
##             N Observed Expected (O-E)^2/E (O-E)^2/V
## under85=0 689      537      560     0.926      1.73
## under85=1 912      742      719     0.721      1.73
## 
##  Chisq= 1.7  on 1 degrees of freedom, p= 0.189

Survival data analysis