Prediction models for estimation of survival rate and relapse for breast cancer patients

In this paper, we describe the practical application of data mining methods to estimating survival rate and disease relapse in breast cancer patients. We carried out a comparative study of prominent machine learning models, and the results indicate that the classifiers learn some of the concepts underlying breast cancer survivability and recurrence. The algorithms were successfully applied to a novel breast cancer data set from the Clinical Center of Kragujevac. The Naive Bayes classifier was selected as the model for prognosis of cancer survivability on the basis of the 5-year survival rate, while the Artificial Neural Network achieved the best performance in prognosis of cancer recurrence. The selection of the twenty attributes most closely related to successful prognosis of survivability can give new insights into the set of prognostic factors that medical experts need to observe.
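
To make the evaluation idea concrete, the following is a minimal sketch of how a Naive Bayes classifier might be scored with k-fold cross-validation. It is not the authors' pipeline: the Gaussian likelihood, the two numeric attributes, the synthetic data, and all function names (`fit_gaussian_nb`, `predict`, `cross_val_accuracy`, `make_synthetic`) are assumptions introduced purely for illustration, and no clinical data are involved.

```python
import math
import random
from statistics import mean, pvariance


def fit_gaussian_nb(X, y):
    """Estimate class priors and per-feature means and variances."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cols = list(zip(*rows))
        model[label] = {
            "prior": len(rows) / len(X),
            "mean": [mean(c) for c in cols],
            # A small constant keeps the variance strictly positive.
            "var": [pvariance(c) + 1e-9 for c in cols],
        }
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for sample x."""
    def log_posterior(params):
        total = math.log(params["prior"])
        for v, m, s2 in zip(x, params["mean"], params["var"]):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(model, key=lambda label: log_posterior(model[label]))


def cross_val_accuracy(X, y, k=5, seed=0):
    """Mean accuracy over k folds built from a shuffled index list."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return mean(scores)


def make_synthetic(n=200, seed=1):
    """Synthetic stand-in data: two hypothetical numeric attributes
    whose means shift with a binary outcome label (illustration only)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(-1.5 * label, 1.0)])
        y.append(label)
    return X, y


X, y = make_synthetic()
accuracy = cross_val_accuracy(X, y)
print(f"5-fold cross-validated accuracy: {accuracy:.2f}")
```

A comparative study of the kind described would run the same cross-validation loop for each candidate model and compare the resulting scores; accuracy is used here only for brevity, and other measures could be substituted.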
