Predicting prostate cancer recurrence via maximizing the concordance index

To use machine learning algorithms such as neural networks effectively for the analysis of survival data, censored data must be treated correctly. The concordance index (CI) is a standard metric for quantifying the predictive ability of a survival model. We propose a new algorithm that uses the CI directly as the objective function to train a model that predicts whether an event will eventually occur. Optimizing the CI directly allows the model to make full use of the information in both censored and non-censored observations. Because the CI itself is not differentiable, we approximate it with a differentiable function so that gradient-based methods can be used to train the model. We applied the new algorithm to predict the eventual recurrence of prostate cancer following radical prostatectomy. Compared with the traditional Cox proportional hazards model and several other algorithms based on neural networks and support vector machines, our algorithm is significantly better at identifying high-risk and low-risk groups of patients.
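To make the idea of a differentiable stand-in for the CI concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' formulation: the abstract does not specify the approximating function, so a sigmoid of the pairwise score difference is swapped in here, and the function names and the `scale` parameter are hypothetical. A pair (i, j) is treated as comparable when subject i has an observed event before subject j's recorded time, which is how censored observations still contribute.

```python
import numpy as np


def comparable_pairs(time, event):
    """Return index arrays (i, j) for pairs where subject i has an observed
    event strictly before subject j's recorded time (event or censoring)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    mask = (time[:, None] < time[None, :]) & event[:, None]
    return np.nonzero(mask)


def concordance_index(risk, time, event):
    """Exact CI: fraction of comparable pairs in which the subject with the
    earlier event has the higher risk score (ties count as one half)."""
    risk = np.asarray(risk, dtype=float)
    i, j = comparable_pairs(time, event)
    diff = risk[i] - risk[j]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def smooth_concordance(risk, time, event, scale=1.0):
    """Differentiable surrogate (an assumption, not the paper's function):
    the step function on each pair is replaced by a sigmoid of the score
    difference, so the result has a gradient with respect to `risk`."""
    risk = np.asarray(risk, dtype=float)
    i, j = comparable_pairs(time, event)
    diff = risk[i] - risk[j]
    return float(np.mean(1.0 / (1.0 + np.exp(-diff / scale))))


# Toy data: follow-up times, event indicators (0 = censored), model scores.
time = [2.0, 4.0, 6.0, 8.0]
event = [1, 0, 1, 1]
risk = [0.9, 0.3, 0.6, 0.1]

exact = concordance_index(risk, time, event)
smooth = smooth_concordance(risk, time, event)
sharp = smooth_concordance(risk, time, event, scale=0.01)
```

As `scale` shrinks, the surrogate approaches the exact CI but its gradients become flatter away from ties, so in practice the sharpness would be a tuning choice. Both functions assume at least one comparable pair exists; with none, the mean is undefined.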
