We work with two independent censored samples of sizes \(n_1\) and \(n_2\), one has survival function \(S_1(t)\) and cumulative hazard \(\Lambda_1(t)\), the other has survival function \(S_2(t)\) and cumulative hazard \(\Lambda_2(t)\). The censoring distribution need not be the same in both groups.

Consider the hypothesis \(H_0: S_1(t)=S_2(t)\) against the alternative \(H_1: \exists t\in \mathbb{R}: S_1(t)\neq S_2(t)\), which is equivalent to testing \(H_0: \Lambda_1(t)=\Lambda_2(t)\) against the alternative \(H_1: \exists t\in \mathbb{R}: \Lambda_1(t)\neq \Lambda_2(t)\).

The class of weighted logrank test statistics for testing \(H_0\) against \(H_1\) is \[ W_K=\int_0^\infty K(s)\,d(\widehat{\Lambda}_1-\widehat{\Lambda}_2)(s), \] where \[ K(s)=\sqrt{\frac{n_1n_2}{n_1+n_2}} \frac{\overline{Y}_1(s)}{n_1}\, \frac{\overline{Y}_2(s)}{n_2}\,\frac{n_1+n_2}{\overline{Y}(s)}W(s) \] and \(W(s)\) is a weight function – it governs the power of the test against different alternatives.

\(W(s)=1\) is the

*logrank test*. It is most powerful against proportional hazards alternatives, i.e., \(\lambda_1(t)=c\lambda_2(t)\) for some positive constant \(c\neq 1\).\(W(s)=\widehat{S}(s-)\) is the

*Prentice-Wilcoxon test*(also known as*Peto-Prentice*or*Peto and Peto*test). It is especially powerful against alternatives with decreasing hazard ratios – strong early effects that weaken over time.\(W(s)=\{\widehat{S}(s-)\}^\rho\{1-\widehat{S}(s-)\}^\gamma\) is the \(G(\rho,\gamma)\) class of tests of Fleming-Harrington. The logrank test is a special case for \(\rho=\gamma=0\), the Prentice-Wilcoxon test is a special case for \(\rho=1,\ \gamma=0\). The \(G(0,1)\) test is especially powerful against alternatives with increasing hazard ratios – little initial differrence in hazards that gets stronger over time.

Let \(\widehat\sigma^2_K\) be the estimated variance of \(W_K\) under \(H_0\). It can be shown that, under \(H_0\), \[ \frac{W_K}{\widehat\sigma_K}\stackrel{D}{\longrightarrow}\mathsf{N}(0,1) \qquad\text{ and }\qquad \frac{W^2_K}{\widehat\sigma^2_K}\stackrel{D}{\longrightarrow}\chi^2_1. \]

Let \(t_1<t_2<\cdots<t_d\) be the ordered distinct failure times in both samples. The weighted logrank test statistic (without the normalizing constant depending on \(n_1\) and \(n_2\)) can be written as \[ \sum_{j=1}^d \biggl(\Delta\overline{N}_1(t_j)-\Delta\overline{N}(t_j) \frac{\overline{Y_1}(t_j)}{\overline{Y}(t_j)}\biggr) W(t_j). \]

The function `survdiff`

in the library `survival`

can calculate \(G(\rho,0)\) statistics. The default is \(\rho=0\), that is, the logrank test.

```
library(survival)
survdiff(Surv(time,delta)~type,data=all)
survdiff(Surv(time,delta)~type,data=all,rho=2)
```

The function `FHtestrcc`

in the library `FHtest`

can calculate \(G(\rho,\gamma)\) statistics for non-zero \(\gamma\). The default is \(\rho=0,\ \gamma=0\), that is, the logrank test. The parameter \(\gamma\) is entered as the argument `lambda`

. {FHtest}

```
library(FHtest)
FHtestrcc(Surv(time,delta)~type,data=all)
FHtestrcc(Surv(time,delta)~type,data=all,rho=0.5,lambda=2)
```

Download the dataset km_all.RData. See the previous exercise for the description of this dataset.

Using ordinary R functions (not the `survival`

library), calculate and print the following table:

\(j\) | \(t_j\) | \(d_{1j}\) | \(d_{j}\) | \(y_{1j}\) | \(y_j\) | \(e_j=d_j\frac{y_{1j}}{y_j}\) | \(d_{1j}-e_j\) |
---|---|---|---|---|---|---|---|

1 | … | … | … | … | … | … | … |

… | … | … | … | … | … | … | … |

d | … | … | … | … | … | … | … |

where \(j\) is the order of the failures, \(t_j\) is the ordered failure time, \(d_{1j}=\Delta\overline{N}_1(t_j)\), \(d_i=\Delta\overline{N}(t_j)\), \(y_{1j}=\overline{Y}_1(t_j)\), \(y_j=\overline{Y}(t_j)\).

Then calculate the numerator of the (standard) logrank test statistic and compare the result with the output of the functions `survdiff`

and `FHtestrcc`

.

In the `all`

data, investigate the difference in survival and hazard functions according to the type of transplant. Remember that the outcome is relapse-free survival.

Calculate and plot estimated survival functions by the type of transplant (1 = allogeneic, 2 = autologous). Distinguish the groups by color. Add a legend.

Calculate and plot Nelson-Aalen estimators of cumulative hazard functions by the type of transplant (1 = allogeneic, 2 = autologous). Distinguish the groups by color. Add a legend.

Calculate and plot smoothed hazard functions by the type of transplant (1 = allogeneic, 2 = autologous). Distinguish the groups by color. Add a legend.

**Hint for obtaining smoothed hazard functions:**

```
library(muhaz)
haz1=muhaz(all$time[all$type==1],all$delta[all$type==1])
haz2=muhaz(all$time[all$type==2],all$delta[all$type==2])
plot(haz1)
lines(haz2,col="red")
```

- Perform logrank and Prentice-Wilcoxon tests using functions
`survdiff`

and`FHtestrcc`

. Interpret the results.

Activate the dataset `veteran`

by calling `data(veteran)`

. It includes 137 observations and eight variables. The observations are lung cancer patients. The variable `time`

contains survival time (in days) since the start of treatment. The variable `status`

includes the event indicator (1 = death, 0 = censoring). The variable `trt`

distinguishes two different treatments (1 = standard, 2 = experimental). The treatment was randomly assigned to the patients.

Compare survival distributions between standard and experimental treatments. Conduct the analysis according to the steps 1–4 of the previous task.

Compare survival distributions between patients with Karnofsky scores up to 60 and over 60 (regardless of treatment assignment). These scores are stored in variable `karno`

. Conduct the analysis according to the steps 1–4 of the previous task.

Generate \(n_1=100\) censored observations from the Weibull distribution with shape parameter \(\alpha=0.7\) and scale parameter \(1/\lambda=2\). Its expectation is \(\Gamma(1+1/\alpha)/\lambda=2\Gamma(17/7)\doteq 2.53\). Generate another sample of \(n_2=100\) censored observations from the exponential distribution with the same mean. Let the censoring distribution in both samples be exponential with the rate \(\lambda=0.2\) (the expectation is \(1/\lambda=5\)), independent of survival. Put both samples in a single dataframe, introducing a variable distingushing which sample each observation comes from.

Compare survival distributions between the two groups using the generated data. Conduct the analysis according to the steps 1–4 of Task 2 (i.e., perform a descriptive analysis before you do the test). Display the true survival and hazard functions in the plots generated at steps 1–3 of the analysis.

Show that with uncensored data, the Gehan-Wilcoxon weighted logrank test with \(W(s)=\frac{\overline{Y}(s)}{n_1+n_2}\) (the weight is proportional to the total number of subjects at risk) is equivalent to the standard Wilcoxon test.

Due date: Dec. 7