# Exercise 3: Testing equality of censored distributions

Back to course page

## Theory

### The problem

We work with two independent censored samples of sizes $$n_1$$ and $$n_2$$, one has survival function $$S_1(t)$$ and cumulative hazard $$\Lambda_1(t)$$, the other has survival function $$S_2(t)$$ and cumulative hazard $$\Lambda_2(t)$$. The censoring distribution need not be the same in both groups.

Consider the hypothesis $$H_0: S_1(t)=S_2(t)$$ against the alternative $$H_1: \exists t\in \mathbb{R}: S_1(t)\neq S_2(t)$$, which is equivalent to testing $$H_0: \Lambda_1(t)=\Lambda_2(t)$$ against the alternative $$H_1: \exists t\in \mathbb{R}: \Lambda_1(t)\neq \Lambda_2(t)$$.

### Weighted logrank statistics

The class of weighted logrank test statistics for testing $$H_0$$ against $$H_1$$ is $W_K=\int_0^\infty K(s)\,d(\widehat{\Lambda}_1-\widehat{\Lambda}_2)(s),$ where $K(s)=\sqrt{\frac{n_1n_2}{n_1+n_2}} \frac{\overline{Y}_1(s)}{n_1}\, \frac{\overline{Y}_2(s)}{n_2}\,\frac{n_1+n_2}{\overline{Y}(s)}W(s)$ and $$W(s)$$ is a weight function – it governs the power of the test against different alternatives.

1. $$W(s)=1$$ is the logrank test. It is most powerful against proportional hazards alternatives, i.e., $$\lambda_1(t)=c\lambda_2(t)$$ for some positive constant $$c\neq 1$$.

2. $$W(s)=\widehat{S}(s-)$$ is the Prentice-Wilcoxon test (also known as Peto-Prentice or Peto and Peto test). It is especially powerful against alternatives with decreasing hazard ratios – strong early effects that weaken over time.

3. $$W(s)=\{\widehat{S}(s-)\}^\rho\{1-\widehat{S}(s-)\}^\gamma$$ is the $$G(\rho,\gamma)$$ class of tests of Fleming-Harrington. The logrank test is a special case for $$\rho=\gamma=0$$, the Prentice-Wilcoxon test is a special case for $$\rho=1,\ \gamma=0$$. The $$G(0,1)$$ test is especially powerful against alternatives with increasing hazard ratios – little initial differrence in hazards that gets stronger over time.

Let $$\widehat\sigma^2_K$$ be the estimated variance of $$W_K$$ under $$H_0$$. It can be shown that, under $$H_0$$, $\frac{W_K}{\widehat\sigma_K}\stackrel{D}{\longrightarrow}\mathsf{N}(0,1) \qquad\text{ and }\qquad \frac{W^2_K}{\widehat\sigma^2_K}\stackrel{D}{\longrightarrow}\chi^2_1.$

### Calculation of weighted logrank statistics

Let $$t_1<t_2<\cdots<t_d$$ be the ordered distinct failure times in both samples. The weighted logrank test statistic (without the normalizing constant depending on $$n_1$$ and $$n_2$$) can be written as $\sum_{j=1}^d \biggl(\Delta\overline{N}_1(t_j)-\Delta\overline{N}(t_j) \frac{\overline{Y_1}(t_j)}{\overline{Y}(t_j)}\biggr) W(t_j).$

## Implementation in R

The function survdiff in the library survival can calculate $$G(\rho,0)$$ statistics. The default is $$\rho=0$$, that is, the logrank test.

library(survival)
survdiff(Surv(time,delta)~type,data=all)
survdiff(Surv(time,delta)~type,data=all,rho=2)

The function FHtestrcc in the library FHtest can calculate $$G(\rho,\gamma)$$ statistics for non-zero $$\gamma$$. The default is $$\rho=0,\ \gamma=0$$, that is, the logrank test. The parameter $$\gamma$$ is entered as the argument lambda. {FHtest}

library(FHtest)
FHtestrcc(Surv(time,delta)~type,data=all)
FHtestrcc(Surv(time,delta)~type,data=all,rho=0.5,lambda=2)

Download the dataset km_all.RData. See the previous exercise for the description of this dataset.

Using ordinary R functions (not the survival library), calculate and print the following table:

$$j$$ $$t_j$$ $$d_{1j}$$ $$d_{j}$$ $$y_{1j}$$ $$y_j$$ $$e_j=d_j\frac{y_{1j}}{y_j}$$ $$d_{1j}-e_j$$
1
d

where $$j$$ is the order of the failures, $$t_j$$ is the ordered failure time, $$d_{1j}=\Delta\overline{N}_1(t_j)$$, $$d_i=\Delta\overline{N}(t_j)$$, $$y_{1j}=\overline{Y}_1(t_j)$$, $$y_j=\overline{Y}(t_j)$$.

Then calculate the numerator of the (standard) logrank test statistic and compare the result with the output of the functions survdiff and FHtestrcc.

In the all data, investigate the difference in survival and hazard functions according to the type of transplant. Remember that the outcome is relapse-free survival.

1. Calculate and plot estimated survival functions by the type of transplant (1 = allogeneic, 2 = autologous). Distinguish the groups by color. Add a legend.

2. Calculate and plot Nelson-Aalen estimators of cumulative hazard functions by the type of transplant (1 = allogeneic, 2 = autologous). Distinguish the groups by color. Add a legend.

3. Calculate and plot smoothed hazard functions by the type of transplant (1 = allogeneic, 2 = autologous). Distinguish the groups by color. Add a legend.

Hint for obtaining smoothed hazard functions:

library(muhaz)

haz1=muhaz(all$time[all$type==1],all$delta[all$type==1])
haz2=muhaz(all$time[all$type==2],all$delta[all$type==2])

plot(haz1)
lines(haz2,col="red")
1. Perform logrank and Prentice-Wilcoxon tests using functions survdiff and FHtestrcc. Interpret the results.

Activate the dataset veteran by calling data(veteran). It includes 137 observations and eight variables. The observations are lung cancer patients. The variable time contains survival time (in days) since the start of treatment. The variable status includes the event indicator (1 = death, 0 = censoring). The variable trt distinguishes two different treatments (1 = standard, 2 = experimental). The treatment was randomly assigned to the patients.

Compare survival distributions between standard and experimental treatments. Conduct the analysis according to the steps 1–4 of the previous task.

Compare survival distributions between patients with Karnofsky scores up to 60 and over 60 (regardless of treatment assignment). These scores are stored in variable karno. Conduct the analysis according to the steps 1–4 of the previous task.

Generate $$n_1=100$$ censored observations from the Weibull distribution with shape parameter $$\alpha=0.7$$ and scale parameter $$1/\lambda=2$$. Its expectation is $$\Gamma(1+1/\alpha)/\lambda=2\Gamma(17/7)\doteq 2.53$$. Generate another sample of $$n_2=100$$ censored observations from the exponential distribution with the same mean. Let the censoring distribution in both samples be exponential with the rate $$\lambda=0.2$$ (the expectation is $$1/\lambda=5$$), independent of survival. Put both samples in a single dataframe, introducing a variable distingushing which sample each observation comes from.
Show that with uncensored data, the Gehan-Wilcoxon weighted logrank test with $$W(s)=\frac{\overline{Y}(s)}{n_1+n_2}$$ (the weight is proportional to the total number of subjects at risk) is equivalent to the standard Wilcoxon test.