Survival Regression with Accelerated Failure Time Model in XGBoost

Survival regression is used to estimate the relationship between time-to-event outcomes and feature variables, and it is important in application domains such as medicine, marketing, risk management, and sales management. Nonlinear tree-based machine learning algorithms, as implemented in libraries such as XGBoost, scikit-learn, LightGBM, and CatBoost, are often more accurate in practice than linear models. However, existing implementations of tree-based models offer only limited support for survival regression. In this work, we propose and implement loss functions for learning accelerated failure time (AFT) models in XGBoost, extending its support for survival modeling to data sets with different kinds of label censoring. The AFT model assumes that the features act to directly accelerate or decelerate the survival time, and it naturally accommodates censored labels. Using real and simulated data sets, we demonstrate the effectiveness of AFT in XGBoost relative to a number of baselines in two respects: generalization performance and training speed. Furthermore, we take advantage of the support for NVIDIA GPUs in XGBoost to achieve substantial speedups over multi-core CPUs. To our knowledge, our work is the first implementation of AFT that utilizes the processing power of NVIDIA GPUs.
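
The AFT objective described above shipped in XGBoost as the survival:aft objective. The sketch below, loosely following XGBoost's AFT survival tutorial, shows how censored labels and the AFT loss parameters might be specified; the toy data, seed, and hyperparameter values are illustrative assumptions, not from the paper.

```python
# Minimal sketch: training an AFT survival model with XGBoost's
# survival:aft objective on toy data (values here are assumed, not
# taken from the paper's experiments).
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 2))

# Censored labels are given as (lower, upper) bounds on the survival time:
#   uncensored:        lower == upper
#   right-censored:    upper == +inf
#   left-censored:     lower == 0
#   interval-censored: 0 < lower < upper < +inf
y_lower = np.array([2.0, 3.0, 0.0, 4.0])
y_upper = np.array([2.0, np.inf, 1.5, 6.0])

dtrain = xgb.DMatrix(X)
dtrain.set_float_info('label_lower_bound', y_lower)
dtrain.set_float_info('label_upper_bound', y_upper)

params = {
    'objective': 'survival:aft',        # AFT loss for censored survival data
    'eval_metric': 'aft-nloglik',       # negative log likelihood of the AFT model
    'aft_loss_distribution': 'normal',  # error distribution: 'normal', 'logistic', or 'extreme'
    'aft_loss_distribution_scale': 1.0, # scale parameter (sigma) of the error term
    'tree_method': 'hist',              # 'gpu_hist' (or device='cuda' on newer
                                        # XGBoost) enables NVIDIA GPU training
    'learning_rate': 0.05,
    'max_depth': 2,
}
bst = xgb.train(params, dtrain, num_boost_round=50, evals=[(dtrain, 'train')])

# Predictions are estimated survival times on the original (not log) scale.
print(bst.predict(dtrain))
```

Under the AFT assumption, ln Y = T(x) + sigma * Z, where T(x) is the tree ensemble's output and Z is a noise term; the choice of distribution for Z determines the likelihood that is evaluated against each censored interval during boosting.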
