Deep Cox Mixtures for Survival Regression

ABSTRACT Survival analysis is a challenging variation of regression modeling because of the presence of censoring, where the outcome measurement is only partially known due to, for example, loss to follow-up. Such problems come up frequently in medical applications, making survival analysis a key endeavor in biostatistics and machine learning for healthcare, with Cox regression models being among the most commonly employed models. We describe a new approach for survival analysis regression models, based on learning mixtures of Cox regressions to model individual survival distributions. We propose an approximation to the Expectation Maximization algorithm for this model that makes hard assignments to mixture groups to make optimization efficient. Given each group assignment, we fit the hazard ratios within each group using deep neural networks and estimate the baseline hazard for each mixture component non-parametrically. We perform experiments on multiple real-world datasets and examine the mortality rates of patients across ethnicity and gender. We emphasize the importance of calibration in healthcare settings and demonstrate that our approach outperforms classical and modern survival analysis baselines, both in terms of discriminative performance and calibration, with large gains in performance on the minority demographics.
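The two ingredients named in the abstract, a non-parametric baseline hazard per component and hard assignment to mixture groups, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `breslow_cumulative_hazard` and `hard_assign` are hypothetical. The risk scores are passed in as plain numbers where the paper would use a deep network's output. The Breslow estimator is one common non-parametric choice and is assumed here. Choosing the most probable component by argmax is also an assumption; sampling from the posterior is another option.

```python
import numpy as np

def breslow_cumulative_hazard(times, events, risk_scores):
    """Breslow estimate of the cumulative baseline hazard (illustrative helper).

    times: observed times; events: 1 if the event occurred, 0 if censored;
    risk_scores: exp(f(x_i)), the relative risk of each individual.
    Returns the distinct event times and the cumulative hazard at each one.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    risk_scores = np.asarray(risk_scores, dtype=float)

    event_times = np.unique(times[events == 1])
    increments = np.empty(len(event_times))
    for j, u in enumerate(event_times):
        n_events = np.sum((times == u) & (events == 1))
        # Risk set: everyone still under observation at time u.
        at_risk = np.sum(risk_scores[times >= u])
        increments[j] = n_events / at_risk
    return event_times, np.cumsum(increments)

def hard_assign(log_gate, log_lik):
    """Pick, for each individual, the component with the highest posterior.

    log_gate, log_lik: arrays of shape (n_individuals, n_components) holding
    the log mixing weights and per-component log-likelihoods.
    Assumption: argmax assignment; a sampled assignment is also possible.
    """
    log_post = np.asarray(log_gate) + np.asarray(log_lik)
    return np.argmax(log_post, axis=1)

# Toy usage: with all risk scores equal to 1 the Breslow estimate reduces
# to the Nelson-Aalen estimator.
event_times, H0 = breslow_cumulative_hazard(
    times=[1, 2, 3, 4], events=[1, 1, 0, 1], risk_scores=[1, 1, 1, 1]
)
z = hard_assign(
    log_gate=np.log([[0.5, 0.5], [0.5, 0.5]]),
    log_lik=np.array([[-1.0, -2.0], [-3.0, -0.5]]),
)
```

In a full training loop under these assumptions, the assignments `z` would partition the data, each component's network and baseline hazard would be refit on its own partition, and the two steps would alternate until the assignments stabilize.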
