Learning to rank for censored survival data

Survival analysis is a type of semi-supervised ranking task where the target output (the survival time) is often right-censored. Using censored examples is challenging because it is not obvious how to incorporate them correctly into a model. We study how three categories of loss functions can take advantage of this information: partial likelihood methods, rank methods, and our classification method, which uses a Wasserstein metric (WM) and the non-parametric Kaplan-Meier estimate of the probability density to impute the labels of censored examples. The proposed method yields a model that predicts the probability distribution of an event over time. Access to this detailed distribution would help a clinician in treatment planning, for example by determining whether the risk of kidney graft rejection is constant or peaks after some time. We also demonstrate that this approach directly optimizes the expected C-index, which is the most common evaluation metric for ranking survival models.
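To make two of the quantities named above concrete, the following is a minimal sketch of the textbook Kaplan-Meier survival estimate and of Harrell's C-index under right-censoring. It is not the authors' implementation: the function names `kaplan_meier` and `concordance_index`, the convention that `events[i]` is 1 for an observed event and 0 for a censored one, and the convention that a higher `risk` score means an earlier expected event are all assumptions made for illustration.

```python
import numpy as np


def kaplan_meier(times, events):
    """Textbook Kaplan-Meier estimate of the survival function S(t).

    Returns the distinct observed event times and S(t) just after each one.
    Censored subjects count as "at risk" up to their censoring time but
    never contribute an event.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)            # still under observation at t
        deaths = np.sum((times == t) & events)  # observed events exactly at t
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, np.array(survival)


def concordance_index(times, events, risk):
    """Harrell's C-index with right-censoring (illustrative O(n^2) version).

    A pair (i, j) is comparable only if the subject with the shorter time
    had an observed event. Tied risk scores count as half-concordant.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

In this sketch a censored example still contributes information: it enlarges the at-risk set in the Kaplan-Meier estimate and it can serve as the later member of a comparable pair in the C-index, which is the kind of partial supervision the abstract refers to.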
