Bayesian variable selection and estimation in semiparametric joint models of multivariate longitudinal and survival data

This paper presents a novel semiparametric joint model for multivariate longitudinal and survival data (SJMLS). The model relaxes the normality assumption on the longitudinal outcomes, leaves the baseline hazard functions unspecified, and allows the history of the longitudinal response to affect the risk of dropout. Using Bayesian penalized splines to approximate the unspecified baseline hazard functions, and combining the Gibbs sampler with the Metropolis-Hastings algorithm, we propose a Bayesian Lasso (BLasso) method that simultaneously estimates the unknown parameters and selects important covariates in SJMLS. Simulation studies are conducted to investigate the finite-sample performance of the proposed techniques. An example from the International Breast Cancer Study Group (IBCSG) illustrates the proposed methodologies.
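To make the Bayesian Lasso idea concrete, the following is a minimal sketch of a Gibbs sampler for the Bayesian Lasso in an ordinary linear regression. It is not the SJMLS sampler described above: it omits the survival submodel, the penalized-spline baseline hazard, and the Metropolis-Hastings steps, and it holds the shrinkage parameter fixed. The function name `bayesian_lasso_gibbs`, the value `lam=1.0`, the simulated data, and the rule of flagging covariates whose 95% credible interval excludes zero are all assumptions made for illustration.

```python
import numpy as np


def bayesian_lasso_gibbs(X, y, lam=1.0, n_iter=2000, burn_in=500, seed=0):
    """Gibbs sampler for the Bayesian Lasso in a linear model (illustrative).

    Assumed hierarchy: beta_j | sigma2, tau2_j ~ N(0, sigma2 * tau2_j),
    tau2_j ~ Exponential(lam^2 / 2), and p(sigma2) proportional to 1/sigma2.
    The shrinkage parameter lam is held fixed here for simplicity.
    """
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)      # center covariates; the intercept is dropped
    y = y - y.mean()            # center the response
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y

    sigma2 = 1.0
    inv_tau2 = np.ones(p)
    draws = np.empty((n_iter - burn_in, p))

    for it in range(n_iter):
        # beta | rest ~ N(A^{-1} X'y, sigma2 * A^{-1}), A = X'X + diag(1/tau2)
        A_inv = np.linalg.inv(XtX + np.diag(inv_tau2))
        cov = sigma2 * A_inv
        cov = (cov + cov.T) / 2.0   # remove round-off asymmetry
        beta = rng.multivariate_normal(A_inv @ Xty, cov)

        # sigma2 | rest ~ Inverse-Gamma(shape, rate)
        resid = y - X @ beta
        shape = (n - 1) / 2.0 + p / 2.0
        rate = resid @ resid / 2.0 + beta @ (inv_tau2 * beta) / 2.0
        sigma2 = 1.0 / rng.gamma(shape, 1.0 / rate)

        # 1/tau2_j | rest ~ Inverse-Gaussian(mean=mu_j, shape=lam^2)
        mu = np.sqrt(lam**2 * sigma2 / np.maximum(beta**2, 1e-12))
        inv_tau2 = rng.wald(mu, lam**2)

        if it >= burn_in:
            draws[it - burn_in] = beta

    return draws


# Simulated example: two of the five covariates have nonzero effects.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
true_beta = np.array([3.0, 0.0, 0.0, -2.0, 0.0])
y = X @ true_beta + rng.normal(size=200)

draws = bayesian_lasso_gibbs(X, y)
post_mean = draws.mean(axis=0)
lo, hi = np.percentile(draws, [2.5, 97.5], axis=0)
selected = (lo > 0) | (hi < 0)  # credible interval excludes zero
```

Here `post_mean` holds the posterior mean of each coefficient and `selected` marks the covariates the interval rule would retain. In a joint model, conditionals that lack a closed form would need a Metropolis-Hastings update in place of a direct draw.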
