We will work with the dataset pbc
contained in the survival
library.
library(survival)
data(pbd)
str(pbc)
help(pbc)
Investigate the structure of the dataset and read the description included in the R help page for this data.
Plot survival curves and calculate the logrank test to compare time to death in subgroups defined by the following variables [continuous variables should be cut in sample quartiles to obtain four groups of approximately equal size]. Interpret the results.
Hint: write a function, call it repeatedly for the 12 variables
The function to fit the proportional hazards model is called coxph
. It’s syntax is [example]
fit=coxph(Surv(time,delta)~age+bili+hepato,data=pbc)
The logic of specifying the model formula is the same as with functions lm
or glm
. Printing the results and performing tests of hypotheses for model building is also very similar to other regression functions in R:
summary(fit)
anova(fit)
anova(fit1,fit2)
drop1(fit,test="Chisq")
The parameter estimation is based on the maximum partial likelihood method but the asymptotic properties, tests etc. are all the same as with ordinary likelihood methods, as applied, for example, in generalized linear models. In particular, a submodel is tested against a wider model by likelihood ratio tests performed by functions anova
or drop1
. The difference of log-likelihoods multiplied by 2 has asymptotically \(\chi^2_d\) distribution if the submodel is true, where \(d\) is the difference in the number of parameters. The tests shown in the parameter table of summary
function are Wald tests for zero values of the individual parameters.
Fit a proportional hazards model with grouped bilirubin as the single covariate. Interpret the estimated parameters. Test the hypothesis that bilirubin does not affect survival of primary biliary cirrhosis patients. Compare the results with the logrank test from Task 1.
Build a proportional hazards model for survival of primary biliary cirrhosis patients. Consider the variables listed in Task 1 as potential covariates. Continuous variables can be included in the linear form, or after transformation, or in the grouped form. Do not consider interactions. Keep only the covariates that have a significant effect on survival.
Interpret the effect of covariates that have been kept in the final model and test their effects on survival.