Variational Inference for Gaussian Process Models for Survival Analysis

The Gaussian process survival analysis model (GPSAM) was recently proposed to address key deficiencies of the Cox proportional hazards model: it accounts for uncertainty in the modeling of the hazard function while relaxing the Cox model's assumption that the hazard factorizes into separate time and covariate components. However, the existing MCMC inference algorithms for GPSAM have proven to be slow in practice. In this paper we propose novel, scalable variational inference algorithms for GPSAM that reduce time complexity relative to the sampling approaches and scale to large datasets. We accomplish this by employing two effective strategies from scalable GP inference: i) pseudo inputs and ii) approximation via random feature expansions. In both setups, we derive the full and partial likelihood formulations typically considered in survival analysis. The proposed approaches are evaluated on two clinical datasets and a divorce-marriage benchmark dataset, where we demonstrate improvements in prediction accuracy over existing survival analysis methods, while reducing the complexity of inference compared to recent state-of-the-art MCMC-based algorithms.
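As a rough illustration of strategy ii), the sketch below approximates a kernel with a random feature expansion. It is not the paper's implementation: the squared-exponential kernel, the lengthscale, and the helper names (`sample_rff`, `rff_map`, `approx_kernel`) are assumptions made here for illustration only. The idea is that an inner product of D random cosine features stands in for the exact kernel, so a GP can be handled as a linear model in D weights rather than through an N x N kernel matrix.

```python
import math
import random


def rbf_kernel(x, y, lengthscale=1.0):
    """Exact squared-exponential kernel (assumed here; the source names no kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / lengthscale ** 2)


def sample_rff(dim, num_features, lengthscale=1.0, seed=0):
    """Draw random frequencies and phases for the cosine feature map."""
    rng = random.Random(seed)
    # Frequencies follow the kernel's spectral density: N(0, 1 / lengthscale^2).
    freqs = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
             for _ in range(num_features)]
    # Phases are uniform on [0, 2*pi).
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return freqs, phases


def rff_map(x, freqs, phases):
    """Map an input to its random feature vector z(x)."""
    scale = math.sqrt(2.0 / len(freqs))
    return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(freqs, phases)]


def approx_kernel(x, y, freqs, phases):
    """Approximate k(x, y) by the inner product z(x) . z(y)."""
    zx = rff_map(x, freqs, phases)
    zy = rff_map(y, freqs, phases)
    return sum(a * b for a, b in zip(zx, zy))


if __name__ == "__main__":
    # Hypothetical inputs: (time, covariate) pairs, chosen only for the demo.
    x, y = [0.5, 1.0], [1.0, 0.2]
    freqs, phases = sample_rff(dim=2, num_features=5000)
    print("exact :", rbf_kernel(x, y))
    print("approx:", approx_kernel(x, y, freqs, phases))
```

The approximation error shrinks roughly as one over the square root of the number of features, so the feature count trades accuracy against cost.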
