Survival analysis models with time-dependent covariates - bivariate failure-time model with one hidden failure
Ladislav Pecen1, Krystof Eben2

1 International Clinical Research Center, St. Anne’s University Hospital, Brno, Czech Rep.,
2 Institute of Computer Science, Academy of Sciences, Prague, Czech Rep.

Covariates in survival analysis of oncology studies can be of different types:

1. fixed, i.e., known when the subject enters the study, e.g., TNM classification
2. time-dependent deterministic, e.g., the patient’s age
3. time-dependent stochastic covariates not measured directly on the patient, e.g., air pollution
4. stochastic covariates measured on the patient, e.g., tumor markers

The simplest way to handle covariates of type 4 above is the Cox regression model with time-dependent covariates, as implemented in SAS PROC PHREG. The authors propose the more complex model described below.
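To illustrate what fitting a Cox model with a time-dependent covariate requires of the data, the following minimal sketch splits one patient's follow-up into (start, stop] intervals, a layout that Cox regression software commonly accepts. The function name `to_counting_process`, the tuple layout, and the rule of carrying the last marker value forward until the next visit are assumptions made for this illustration; they are not taken from the authors' analysis.

```python
def to_counting_process(visits, end_time, event):
    """Split one patient's follow-up into (start, stop, value, event) rows.

    visits   -- list of (time, marker_value) pairs sorted by time
    end_time -- time of failure or censoring
    event    -- True if follow-up ended with a failure, False if censored

    Assumption for this sketch: each marker value is carried forward
    until the next visit (or until the end of follow-up).
    """
    rows = []
    for i, (start, value) in enumerate(visits):
        if start >= end_time:
            break  # visits at or after the end of follow-up carry no risk time
        # The interval ends at the next visit or at end of follow-up.
        stop = visits[i + 1][0] if i + 1 < len(visits) else end_time
        stop = min(stop, end_time)
        # Only the interval that closes the follow-up can carry the event.
        flag = int(event and stop == end_time)
        rows.append((start, stop, value, flag))
    return rows


# Hypothetical patient: marker measured at months 0, 3 and 6, failure at month 8.
rows = to_counting_process([(0, 12.0), (3, 14.5), (6, 30.2)], end_time=8, event=True)
for row in rows:
    print(row)
```

Each row contributes to the partial likelihood only over its own interval, so the marker value in force at each failure time is the most recently observed one.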

Time-dependent covariates are measured with error: Z(t) = Z*(t) + e(t), where Z(t) is the measured value, Z*(t) is the true value of the covariate, and e(t) is the observational error. The risk function depends on Z*(t), but the likelihood can be based only on the observed data Z(t). There are then two options:

         stochastic risk functions, i.e., non-parametric models of the true covariate value Z*(t)

         a parametric model for Z*(t), e.g., the “hockey-stick model”:

                   Z*(t’) = m                     for t’ < t
                   Z*(t’) = m + g (t’ - t)        for t’ ≥ t
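The hockey-stick model and the measurement-error relation Z(t) = Z*(t) + e(t) can be sketched in a few lines. This is only an illustration: the function names, the parameter values, and the choice of independent Gaussian errors with standard deviation `sigma` are assumptions, since the error distribution is not specified here.

```python
import random


def hockey_stick(t_prime, m, g, t):
    """True covariate value Z*(t') under the hockey-stick model.

    m -- baseline level before the change point
    g -- slope after the change point
    t -- change point (the time of the hidden failure)
    """
    if t_prime < t:
        return m
    return m + g * (t_prime - t)


def simulate_observed(times, m, g, t, sigma, seed=0):
    """Observed values Z(t') = Z*(t') + e(t').

    Assumption for this sketch: e(t') are independent N(0, sigma^2) errors.
    """
    rng = random.Random(seed)
    return [hockey_stick(tp, m, g, t) + rng.gauss(0.0, sigma) for tp in times]


# Hypothetical marker: flat at 2.0 until month 3, then rising by 0.5 per month.
times = [0, 1, 2, 3, 4, 5, 6]
true_values = [hockey_stick(tp, m=2.0, g=0.5, t=3.0) for tp in times]
observed = simulate_observed(times, m=2.0, g=0.5, t=3.0, sigma=0.2)
print(true_values)
print(observed)
```

The change point t is not observed directly; only the noisy values are available, which is why inference has to work with the observed Z(t) rather than with Z*(t).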

 
The parametric “hockey-stick model” was used, and the authors proposed a bivariate failure-time model with one hidden failure (defined as the change point t of the “hockey-stick model”). The model was motivated by the need to detect a tumor marker rise preceding disease recurrence, and thus to detect cancer recurrence early. The authors assume that a period of (clinical) latency precedes recurrence of the disease. Latency is a failure hidden from the oncologist; it can be detected with the help of tumor markers, which rise when latency begins. In some cases recurrence is not preceded by a marker rise, so the first failure may be either latency or recurrence.

As usual with incomplete data, the authors have to base the inference on the marginal likelihood. As this is less tractable, the authors use the EM algorithm. The model was implemented using SAS PROC NLIN, with the partial derivatives calculated analytically, as will be presented in detail in the authors’ lecture.