Survival analysis with time‐dependent covariates subject to missing data or measurement error: Multiple Imputation for Joint Modeling (MIJM)

SUMMARY Modern epidemiological studies collect data on time‐varying individual‐specific characteristics, such as body mass index and blood pressure. Incorporating such time‐dependent covariates in time‐to‐event models is of great interest but raises some challenges. Of specific concern are measurement error and the non‐synchronous updating of covariates across individuals, due, for example, to missing data. It is well known that, in the presence of either of these issues, the traditionally used last observation carried forward (LOCF) approach leads to bias. Joint models of longitudinal and time‐to‐event outcomes, developed recently, address these complexities by specifying a model for the joint distribution of all processes, and are commonly fitted by maximum likelihood or Bayesian approaches. However, adequately specifying the full joint distribution can be a challenging modeling task, especially with multiple longitudinal markers. In fact, most available software packages are unable to handle more than one marker and offer a restricted choice of survival models. We propose a two‐stage approach, Multiple Imputation for Joint Modeling (MIJM), to incorporate multiple time‐dependent continuous covariates in the semi‐parametric Cox and additive hazard models. Assuming a primary focus on the time‐to‐event model, the MIJM approach handles the joint distribution of the markers using multiple imputation by chained equations, a computationally convenient procedure that is widely available in mainstream statistical software. We developed an R package, “survtd”, that allows MIJM and the other approaches in this manuscript to be applied easily, with just one call to its main function. A simulation study showed that MIJM performs well across a wide range of scenarios in terms of bias and coverage probability, particularly compared with LOCF, simpler two‐stage approaches, and a Bayesian joint model. The Framingham Heart Study is used to illustrate the approach.
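As a rough illustration of the final step of a multiple-imputation analysis such as the one summarized above, the sketch below pools a hazard-model coefficient across imputed datasets using Rubin's rules. The summary does not state how MIJM combines its estimates, so the use of Rubin's rules here is an assumption, and the function name `pool_rubin` and the example numbers are hypothetical; this is not the interface of the survtd package.

```python
def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: coefficient estimate from each imputed dataset
    variances: squared standard error of that estimate in each dataset
    Returns (pooled_estimate, total_variance).

    Assumption: this is a generic multiple-imputation pooling step,
    not necessarily the exact procedure used by MIJM.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    # Pooled point estimate: simple average over imputations.
    q_bar = sum(estimates) / m
    # Within-imputation variance: average of the squared standard errors.
    within = sum(variances) / m
    # Between-imputation variance: sample variance of the estimates.
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    # Total variance inflates the between part by (1 + 1/m).
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical log hazard ratios and variances from three imputed datasets.
pooled, total_var = pool_rubin([0.10, 0.12, 0.08], [0.0004, 0.0005, 0.0003])
```

The between-imputation term is what carries the uncertainty due to the missing or error-prone marker values into the final standard error, which is why a single imputed dataset would tend to understate it.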
