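The Kaplan-Meier [KM] estimator of the survival function is $\hat S(t)=\prod_{\{i:\,t_i\le t\}}\left[1-\frac{\Delta\bar N(t_i)}{\bar Y(t_i)}\right]$. A uniformly consistent estimator of the variance of $\sqrt n\,[\hat S(t)-S(t)]$ is $\hat V(t)=\hat S^2(t)\,\hat\sigma(t)=\hat S^2(t)\int_0^t\frac{n\,d\bar N(u)}{\bar Y(u)\,[\bar Y(u)-\Delta\bar N(u)]}$ (Greenwood formula).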
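To make the two formulas concrete, the following minimal sketch evaluates the Kaplan-Meier estimate and the Greenwood quantities directly from the event counts and risk-set sizes on a small made-up sample, and compares them with the output of `survfit`. The data and the variable names (`x`, `delta`, `ti`, `dN`, `Y`) are illustrative assumptions, not part of the original text.

```r
library(survival)

# Hypothetical toy sample: observed times and event indicators (1 = event)
x     <- c(2, 3, 3, 5, 6, 8, 9, 11, 12, 15)
delta <- c(1, 1, 0, 1, 0, 1, 1,  0,  1,  0)
n     <- length(x)

# Distinct event times t_i, number of events at t_i, number at risk at t_i
ti <- sort(unique(x[delta == 1]))
dN <- sapply(ti, function(s) sum(x == s & delta == 1))
Y  <- sapply(ti, function(s) sum(x >= s))

S_hat     <- cumprod(1 - dN / Y)               # Kaplan-Meier estimate at t_i
sigma_hat <- cumsum(n * dN / (Y * (Y - dN)))   # sigma-hat(t_i)
V_hat     <- S_hat^2 * sigma_hat               # Greenwood estimate of var of sqrt(n) * (S_hat - S)

# Compare with survfit at the event times
fit <- survfit(Surv(x, delta) ~ 1)
ev  <- fit$n.event > 0
print(cbind(time = ti, S_hat = S_hat, survfit = fit$surv[ev]))
```

The `std.err` component of a `survfit` object is stored on the cumulative hazard scale, so it corresponds to `sqrt(sigma_hat / n)` rather than to `sqrt(V_hat / n)`.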
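An asymptotic pointwise $100(1-\alpha)\%$ confidence interval for $S(t)$ at a fixed $t$ is $\left(\hat S(t)\left[1-u_{1-\alpha/2}\sqrt{\hat\sigma(t)/n}\right],\ \hat S(t)\left[1+u_{1-\alpha/2}\sqrt{\hat\sigma(t)/n}\right]\right)$. Asymptotic simultaneous $100(1-\alpha)\%$ Hall-Wellner confidence bands for $S(t)$ over $t\in\langle 0,\tau\rangle$ (for some pre-specified $\tau$) are $\left(\hat S(t)\left\{1-k_{1-\alpha}(\hat K(\tau))\,[1+\hat\sigma(t)]/\sqrt n\right\},\ \hat S(t)\left\{1+k_{1-\alpha}(\hat K(\tau))\,[1+\hat\sigma(t)]/\sqrt n\right\}\right)$, where $\hat K(t)=\hat\sigma(t)/[1+\hat\sigma(t)]$ and $k_{1-\alpha}(t)$, $t\in(0,1\rangle$, satisfies the equation $P\left[\sup_{u\in\langle 0,t\rangle}|B(u)|>k_{1-\alpha}(t)\right]=\alpha$, where $B$ is the Brownian bridge.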
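The critical value of the Hall-Wellner band is defined only implicitly. As a rough illustration of what the defining equation means, the sketch below approximates it by simulating Brownian bridges on a grid and taking the empirical quantile of the supremum of |B(u)| over u ≤ t. The helper name `hw_crit_sim`, the grid size and the number of replications are assumptions chosen for illustration; this is a Monte Carlo approximation, not the way any particular package computes the value.

```r
# Monte Carlo approximation of k_{1-alpha}(t): the (1 - alpha) quantile of
# sup over u in [0, t] of |B(u)|, where B is a Brownian bridge on [0, 1].
# hw_crit_sim is a hypothetical helper written for this illustration.
hw_crit_sim <- function(t, alpha = 0.05, n_rep = 5000, n_grid = 1000) {
  u    <- (1:n_grid) / n_grid                 # grid on (0, 1]
  keep <- u <= t                              # grid points inside [0, t]
  sups <- replicate(n_rep, {
    W <- cumsum(rnorm(n_grid, sd = sqrt(1 / n_grid)))  # Brownian motion on the grid
    B <- W - u * W[n_grid]                             # bridge: B(u) = W(u) - u * W(1)
    max(abs(B[keep]))
  })
  unname(quantile(sups, 1 - alpha))
}

set.seed(1)
k_full <- hw_crit_sim(1)     # supremum over the whole interval [0, 1]
k_half <- hw_crit_sim(0.5)   # a shorter interval gives a smaller critical value
print(c(k_full = k_full, k_half = k_half))
```

For t = 1 the result should be close to 1.36, the familiar 95% critical value of the Kolmogorov distribution. The grid approximation tends to undershoot slightly because the supremum is taken over finitely many points.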
There are variations of confidence intervals and confidence bounds for S(t) based on various transformations (log, log(−log), arcsin, …). Formulae for these intervals can be derived by the delta method.
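As an illustration of the delta-method argument: for a smooth transformation $g$, the asymptotic variance of $\sqrt n\,[g(\hat S(t))-g(S(t))]$ is $[g'(S(t))]^2S^2(t)\sigma(t)$. For the log and log(−log) transformations this leads to the intervals below, written in the notation used above. This is a sketch of the standard derivation, not a formula taken from the original text.

```latex
\begin{align*}
g(s)=\log s:\quad
  & g'(s)=\frac{1}{s},\qquad
    \widehat{\operatorname{var}}\bigl[\log\hat S(t)\bigr]\approx\frac{\hat\sigma(t)}{n},\\
  & \Bigl(\hat S(t)\exp\bigl\{-u_{1-\alpha/2}\sqrt{\hat\sigma(t)/n}\bigr\},\
          \hat S(t)\exp\bigl\{u_{1-\alpha/2}\sqrt{\hat\sigma(t)/n}\bigr\}\Bigr);\\[1ex]
g(s)=\log(-\log s):\quad
  & g'(s)=\frac{1}{s\log s},\qquad
    \widehat{\operatorname{var}}\bigl[\log(-\log\hat S(t))\bigr]\approx
    \frac{\hat\sigma(t)}{n\,[\log\hat S(t)]^{2}},\\
  & \Bigl(\hat S(t)^{\exp\{u_{1-\alpha/2}\sqrt{\hat\sigma(t)/n}\,/\,|\log\hat S(t)|\}},\
          \hat S(t)^{\exp\{-u_{1-\alpha/2}\sqrt{\hat\sigma(t)/n}\,/\,|\log\hat S(t)|\}}\Bigr).
\end{align*}
```

Unlike the untransformed interval, both transformed intervals stay non-negative, and the log(−log) interval always lies inside (0, 1).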
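library(survival)
# Kaplan-Meier fit for the whole sample with plain 90% pointwise confidence intervals
fit <- survfit(Surv(x, delta) ~ 1, data = dn, conf.type = "plain", conf.int = 0.9)
# Separate Kaplan-Meier fits for each level of grp
fit2 <- survfit(Surv(x, delta) ~ grp, data = dn, conf.type = "plain", conf.int = 0.9)
# Times together with the lower and upper confidence limits
cbind(fit$time, fit$lower, fit$upper)
summary(fit)
# Plot the first group, then add the second group in red
plot(fit2[1], conf.int = TRUE)
lines(fit2[2], conf.int = TRUE, col = "red")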
The function `survfit` in the library `survival` calculates pointwise confidence intervals. The argument `conf.type` specifies the transformation (`conf.type="plain"` means no transformation), the argument `conf.int` specifies the coverage probability (default 0.95). Confidence intervals are stored in the output object of the function `survfit`; the components are called `upper` and `lower`. The contents of `survfit` objects can also be displayed by the function `summary`.
The function `plot` called on `survfit` objects plots the confidence intervals included in the input object. The logical argument `conf.int` determines whether or not confidence intervals are plotted.
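The effect of `conf.type` is easiest to see by fitting the same data with several transformations. The sketch below uses the `aml` data set that ships with the `survival` package purely as a stand-in; the object names (`fit_plain`, `fit_log`, `fit_loglog`, `limits`) are illustrative assumptions.

```r
library(survival)

# aml is a small example data set from the survival package
# (time = follow-up time, status = event indicator)
fit_plain  <- survfit(Surv(time, status) ~ 1, data = aml, conf.type = "plain")
fit_log    <- survfit(Surv(time, status) ~ 1, data = aml, conf.type = "log")
fit_loglog <- survfit(Surv(time, status) ~ 1, data = aml, conf.type = "log-log")

# Same point estimate, different interval limits
limits <- data.frame(
  time         = fit_plain$time,
  surv         = fit_plain$surv,
  plain_lower  = fit_plain$lower,  plain_upper  = fit_plain$upper,
  log_lower    = fit_log$lower,    log_upper    = fit_log$upper,
  loglog_lower = fit_loglog$lower, loglog_upper = fit_loglog$upper
)
print(head(limits))

# Overlay two of the interval types in one plot
plot(fit_plain, conf.int = TRUE)
lines(fit_log, conf.int = TRUE, col = "red")
```

Note that `survfit` uses `conf.type="log"` when the argument is omitted, so `"plain"` has to be requested explicitly to obtain the untransformed interval.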
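library(OIsurv)
# 95% Hall-Wellner bands computed from a Surv object, with upper time limit tU = 240
out <- confBands(Surv(dn$x, dn$delta), confType = "plain", confLevel = 0.95, type = "hall", tU = 240)
# Add the bands to an existing plot of the Kaplan-Meier estimate
lines(out, lty = 3, col = "blue")

library(km.ci)
fit <- survfit(Surv(x, delta) ~ 1, data = dn, conf.type = "plain", conf.int = 0.95)
# Replace the pointwise limits stored in fit by 95% Hall-Wellner bands on [tl, tu]
out <- km.ci(fit, conf.level = 0.95, tl = 0.03, tu = 240, method = "hall-wellner")
summary(out)
lines(out, lty = 3, col = "blue")
plot(out)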
There are two different R libraries that can calculate Hall-Wellner simultaneous confidence bands.
`library(OIsurv)` includes a function called `confBands`, which requires a survival object as the input and returns a list of three vectors (`time`, `lower`, `upper`). There is a method for plotting `lines` from a `confBands` object, but no method for `plot`.
`library(km.ci)` includes a function called `km.ci`, which requires a `survfit` object as the input and returns another `survfit` object with recalculated `lower` and `upper` components. The output can be processed by any function that accepts `survfit` objects, e.g., `plot`, `summary`, `lines`.
Download the dataset km_all.RData. The dataframe inside is called `all`. It includes 101 observations and three variables. The observations are acute lymphatic leukemia [ALL] patients who had undergone bone marrow transplant. The variable `time` contains the time (in months) from transplantation to either death/relapse or end of follow-up, whichever occurred first. The outcome of interest is time to death or relapse of ALL (relapse-free survival). The variable `delta` includes the event indicator (1 = death or relapse, 0 = censoring). The variable `type` distinguishes two different types of transplant (1 = allogeneic, 2 = autologous).
Calculate and plot the Kaplan-Meier estimate, 95% pointwise confidence intervals and 95% Hall-Wellner confidence bounds for all patients together, for patients with allogeneic transplants, and for patients with autologous transplants.
Generate $n=50$ censored observations as follows: the survival distribution is Weibull with shape parameter $\alpha=0.7$ and scale parameter $1/\lambda=2$. Its expectation is $\Gamma(1+1/\alpha)/\lambda=2\,\Gamma(17/7)\doteq 2.53$. The censoring distribution is exponential with rate 0.2 (so its expectation is $1/0.2=5$), independent of survival.
Calculate and plot the Kaplan-Meier estimator of S(t) together with 95% pointwise confidence intervals and 95% Hall-Wellner confidence bounds. Include the true survival function in the plot (use a different color). Include a legend explaining which curve is which.
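The data-generating step of this task can be sketched as follows. This is one possible way to do it, assuming R's `rweibull` parametrization (survival function exp(-(t/scale)^shape)); the seed and the names `t_surv`, `t_cens`, `dn` and `S_true` are illustrative choices, not prescribed by the task.

```r
set.seed(2024)   # arbitrary seed, for reproducibility only
n <- 50

# Weibull survival times: shape alpha = 0.7, scale 1/lambda = 2
t_surv <- rweibull(n, shape = 0.7, scale = 2)
# Independent exponential censoring times with rate 0.2 (mean 5)
t_cens <- rexp(n, rate = 0.2)

x     <- pmin(t_surv, t_cens)           # observed time
delta <- as.numeric(t_surv <= t_cens)   # 1 = event observed, 0 = censored
dn    <- data.frame(x = x, delta = delta)

# True survival function S(t) = exp(-(t / 2)^0.7), e.g. for adding to a plot
S_true <- function(t) pweibull(t, shape = 0.7, scale = 2, lower.tail = FALSE)

# Numerical check of the stated expectation 2 * Gamma(17/7)
print(2 * gamma(1 + 1 / 0.7))
```

The true curve can then be added to an existing plot of the estimate with, for example, `curve(S_true(x), add = TRUE, col = "red")`.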
Conduct a simulation study with data generated as described in Task 2. Generate 500 such datasets and estimate, separately for the 95% pointwise confidence intervals and for the 95% Hall-Wellner confidence bounds, the probability that the true survival curve is wholly covered (restrict the task to a reasonable finite interval).
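A possible skeleton for the simulation, shown here only for the pointwise intervals, is sketched below. It checks coverage on a finite grid, which only approximates coverage over the whole interval. The helper name `one_run`, the interval limits `tL = 0.1` and `tU = 3`, and the grid size are assumptions made for illustration. The same loop could be applied to band limits obtained from another function.

```r
library(survival)

# Hypothetical helper: simulate one data set (Weibull survival, exponential
# censoring) and check whether the plain 95% pointwise intervals contain the
# true survival function at every point of a grid on [tL, tU].
one_run <- function(n = 50, tL = 0.1, tU = 3, n_grid = 200) {
  t_surv <- rweibull(n, shape = 0.7, scale = 2)
  t_cens <- rexp(n, rate = 0.2)
  x      <- pmin(t_surv, t_cens)
  delta  <- as.numeric(t_surv <= t_cens)
  fit    <- survfit(Surv(x, delta) ~ 1, conf.type = "plain", conf.int = 0.95)

  grid <- seq(tL, tU, length.out = n_grid)
  # Evaluate the right-continuous step functions of the limits on the grid;
  # before the first observed time both limits equal 1
  idx    <- findInterval(grid, fit$time)
  lower  <- c(1, fit$lower)[idx + 1]
  upper  <- c(1, fit$upper)[idx + 1]
  S_true <- pweibull(grid, shape = 0.7, scale = 2, lower.tail = FALSE)

  # If a limit is missing (NA) and no grid point fails outright,
  # the run is counted as not covered
  isTRUE(all(lower <= S_true & S_true <= upper))
}

set.seed(1)
covered  <- replicate(500, one_run())
coverage <- mean(covered)   # estimated simultaneous coverage on the grid
print(coverage)
```

Since pointwise intervals are not designed for simultaneous coverage, one would expect this proportion to fall below the nominal 95%.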