This assignment is supposed to be solved via JAGS and
library(runjags). List through the manual
to find what you need. You can compare your results with outputs of
functions jointModel and jointModelBayes from
libraries JM and JMbayes.
We continue with the analysis of aids data from
library(JM), see help(aids) for more
details.
set.seed(123456789)
library(JM)
head(aids[,c("patient", "Time", "death", "CD4", "obstime", "drug", "start", "stop", "event")], 15)
## patient Time death CD4 obstime drug start stop event
## 1 1 16.97 0 10.677078 0 ddC 0 6.00 0
## 2 1 16.97 0 8.426150 6 ddC 6 12.00 0
## 3 1 16.97 0 9.433981 12 ddC 12 16.97 0
## 4 2 19.00 0 6.324555 0 ddI 0 6.00 0
## 5 2 19.00 0 8.124038 6 ddI 6 12.00 0
## 6 2 19.00 0 4.582576 12 ddI 12 18.00 0
## 7 2 19.00 0 5.000000 18 ddI 18 19.00 0
## 8 3 18.53 1 3.464102 0 ddI 0 2.00 0
## 9 3 18.53 1 3.605551 2 ddI 2 6.00 0
## 10 3 18.53 1 6.164414 6 ddI 6 18.53 1
## 11 4 12.70 0 3.872983 0 ddC 0 2.00 0
## 12 4 12.70 0 4.582576 2 ddC 2 6.00 0
## 13 4 12.70 0 2.645751 6 ddC 6 12.00 0
## 14 4 12.70 0 1.732051 12 ddC 12 12.70 0
## 15 5 15.13 0 7.280110 0 ddI 0 2.00 0
We assume LME model with random intercept and slope:
CD4 cell count of patient \(i\) at visit \(j\),drug level ddI,CD4 cell
count \(m_i(t)\) evolves linearly with
time for each patient differently \(m_i(t) =
B_{1i} + B_{2i} t\).Instead of piecewise-constant parametrization of CD4 for
modelling the survival time we use linear function \(m_i(t)\), which joins the LME and
proportional hazards model together. Then, the hazard function for
patient \(i\) is of the following form:
\[\begin{align*}
h_i(t) &= h_0(t) \, \exp\left\{ \gamma_1 D_i + \gamma_2 m_i(t)
\right\} \\
&= h_0(t) \, \exp\left\{ \gamma_1 D_i + \gamma_2 B_{1i} + t \cdot
\gamma_2 B_{2i} \right\}
\end{align*}\] In previous exercise we were able to estimate the
model under \(h_0(t) = 1\). However,
this assumption yields Gompertz distribution with monotonous hazard
function in time, which is not suitable function in this case. According
to smoothed derivative of Breslow estimator of cumulative
hazard function from classical semi-parametric Cox proportional hazards
model, the shape should resemble the following shape:
Such a shape could be fitted with the choice \(h(t) = \alpha t^{\alpha-1} \exp\{a + b t\}\), however, integrating it into the cumulative hazard function \(H(t)\) is tricky. If \(b<0\) we could use the incomplete gamma function, which is a distribution function of \(\Gamma(\cdot,\cdot)\) distribution. We wish to use it for different \(b_i = \gamma_2 B_{2i}\), how could we enforce \(b_i < 0\)?
Another idea would be to use \[ h(t) = \prod\limits_{k=1}^{K} \lambda_k^{\mathbf{1}(t_k \leq t< t_{k+1})} \exp\{a + b t\}, \] where \(0 = t_1 < t_2 < \cdots < t_{K+1} = \infty\) are predefined intervals, on which we have different baseline hazards \(\lambda_k\). This yields non-continuous piece-wise exponential hazard function. It should be straight-forward generalization of our previous attempt. However, the computation of log-hazard and cumulative hazard function becomes more complicated to evaluate: \[ \log h(t) = \log \lambda_{I(t)} + a + bt, \qquad H(t) = \frac{\exp\{a\}}{b} \left[\sum\limits_{k=1}^{I(t)-1} \lambda_k \left(\exp\{b t_{k+1}\} - \exp\{b t_k\}\right) + \lambda_{I(t)} \left(\exp\{b t\} - \exp\{b t_{I(t)}\}\right)\right], \] where \(I(t)\) is the index of the interval for which \(t_{I(t)} \leq t < t_{I(t)+1}\).
When we choose \(h_0(t) = \alpha t^{\alpha-1}\), we no longer obtain Weibull distribution! The distributional family is given by viewing \(h_i(t)\) as a function of \(t\), which would conceptually yield \[ h_i(t) \propto t^{\psi_i} \exp\{ \zeta_i \, t\}, \qquad H_i(t) = \mathrm{const.} \int\limits_0^t s^{\psi_i} \exp\{ \zeta_i \, s\} \, \mathrm{d}s, \] which could be viewed as a generalization of Weibull. Unfortunately, the implementation of JAGS does not cover this distribution.
For simplification, let us assume \(\alpha = 1\), which yields \(h_0(t) = 1\). This option usually results in exponential distribution (constant hazard function). However, the exponential term with \(t\) changes the distribution to something different. Conceptually, we have the (cumulative) hazard function of the form \[ h_i(t) = \exp\{a_i + t \, b_i\}, \qquad H_i(t) = \frac{\exp\{a_i\}}{b_i} \left( \exp\{b_i t\} - 1 \right), \] which is how Gompertz distribution is defined. Sadly, this distribution is not implemented in JAGS as well. However, since \(\log h_i(t)\) and \(H_i(t)\) can be expressed in closed formula, we can fit the model using zero-Poisson trick. We only need to supply our own implementation of the corresponding log-likelihood.
From NMST511
Course Notes by Theorem 2.2 under the assumption of censoring time
\(C_i\) independent of time to death
\(T_i\) the log-likelihood under
presence of right-censoring indicated by \(\delta_i = \mathbf{1}(T_i <= C_i)\)
takes the following form: \[
\ell (\boldsymbol{\theta})
= \mathrm{const.} +
\sum\limits_{i=1}^n \left[\delta_i \log h_i(X_i, \boldsymbol{\theta}) -
H_i(X_i, \boldsymbol{\theta})\right],
\] where \(X_i = \min\{T_i,
C_i\}\) is the event time (Time) and \(\boldsymbol{\theta}\) is the vector of
unknown parameters.
Consider a random variable \(O \sim \mathsf{Pois}(\phi)\) with \(\phi > 0\). Then, \(\mathsf{P}(O = 0) = \exp\{-\phi\}\) and the contribution to log-likelihood when \(O = 0\) is \(-\phi\). We just need to set \(-\phi\) to be the contribution to log-likelihood we desire. There is minor issue with the requirement that \(\phi\) has to be positive. If we set up \[ \phi = -\mathrm{loglik} + C, \] where \(C\) is sufficiently large constant to make (all) \(\phi\) positive. Shifting each contribution to log-likelihood by the same constant does not have any effect to it.
Alltogether, the pseudocode to be used in JAGS with right-censoring is below:
"model{
C <- 10^5
...
for(i in 1:n){
...
logh[i] <- ...
H[i] <- ...
loglik[i] <- delta[i] * logh[i] - H[i]
phi[i] <- C - loglik[i]
zeros[i] ~ dpois(phi[i])
}
...
}
"
Variable zeros is a vector of \(n\) zeros to be given as data. Constant
\(C\) can be given as data in advance
as well.
Extend the JAGS code from Exercise 3a by the survival model with Gompertz hazard function \(h_i(t) = \exp\{a_i + t \, b_i\}\) outlined above using the zero-Poisson trick.
Assume the independent block structure of the prior for model
parameters: \[
p(\boldsymbol{\beta}, \boldsymbol{\gamma}, \tau, \Omega)
=
\prod\limits_{j=1}^3 p(\beta_j) \, \prod\limits_{j=1}^2 p(\gamma_j) \,
p(\tau) \, p(\Omega)
\] Choose weakly informative normal prior for \(\beta_j\) and \(\gamma_j\), gamma prior for \(\tau\) and Wishart distribution
(dwish) for \(\Omega\).
Write down (and print) the model implementation within JAGS.
Sample two Markov chains using JAGS to approximate the posterior
distribution \(p(\boldsymbol{\beta}, \tau,
\Omega \,|\,\mathsf{data})\). Choose appropriate
burnin and thin by monitoring the trajectories
and autocorrelation.
Provide summaries including ET and HPD intervals for primary model parameters.
Explore (and plot) characteristics of the posterior distribution
(posterior mean/median and credible intervals) of the following
parametric functions: \(m_{\mathsf{new}}(t)\), \(h_{\mathsf{new}} (t)\), \(H_{\mathsf{new}} (t)\), \(S_{\mathsf{new}} (t)\) for two newly
observed patients with average evolution of CD4
each treated with different drug.