Prediction of Breast Cancer Survival Through Knowledge Discovery in Databases

The collection of large volumes of medical data has given the medical research community an opportunity to develop survival prediction models. Medical researchers who seek to discover and extract hidden patterns and relationships among a large number of variables use knowledge discovery in databases (KDD) to predict the outcome of a disease. This study was conducted to develop predictive models and to discover relationships between certain predictor variables and survival in the context of breast cancer. The study was cross-sectional. After data preparation, records of 22,763 female patients (mean age 59.4 years) stored in the Surveillance, Epidemiology, and End Results (SEER) breast cancer dataset were analyzed anonymously. IBM SPSS Statistics 16, Access 2003, and Excel 2003 were used for data preparation, and IBM SPSS Modeler 14.2 was used for model design. The Support Vector Machine (SVM) model outperformed the other models in predicting breast cancer survival. The analysis showed that the SVM model identified ten important predictor variables that contributed most to the prediction of breast cancer survival. Among these, behavior of the tumor was the most important variable and stage of malignancy was the least important. In this study, applying the knowledge discovery method to the breast cancer dataset predicted the survival status of breast cancer patients with high confidence and identified the most important variables contributing to breast cancer survival.
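The workflow summarized in the abstract (fit an SVM, then rank predictor variables by importance) can be illustrated with a minimal, standard-library Python sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than SEER records, the classifier is a linear SVM without an intercept trained by Pegasos-style sub-gradient descent, and importance is measured by permutation, which is not necessarily how IBM SPSS Modeler computes predictor importance. All function names are hypothetical.

```python
import random

def make_synthetic_patients(n, seed=0):
    """Synthetic stand-in data (NOT SEER records). Feature 0 drives the
    label strongly, feature 1 weakly, and feature 2 is pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in range(3)]
        score = 2.0 * row[0] + 0.5 * row[1] + rng.gauss(0, 0.3)
        X.append(row)
        y.append(1 if score > 0 else -1)  # +1 / -1 class labels
    return X, y

def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Pegasos-style stochastic sub-gradient descent on the hinge loss.
    No intercept is fitted because the synthetic features are zero-centred."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1 - eta * lam) * wj for wj in w]  # regularization shrinkage
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, row):
    return 1 if sum(wj * xj for wj, xj in zip(w, row)) >= 0 else -1

def accuracy(w, X, y):
    return sum(predict(w, r) == t for r, t in zip(X, y)) / len(y)

def permutation_importance(w, X, y, seed=0):
    """Accuracy drop when one feature column is shuffled across rows."""
    rng = random.Random(seed)
    base = accuracy(w, X, y)
    drops = []
    for j in range(len(X[0])):
        col = [r[j] for r in X]
        rng.shuffle(col)
        X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, col)]
        drops.append(base - accuracy(w, X_perm, y))
    return drops

X, y = make_synthetic_patients(500)
X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]

w = train_linear_svm(X_train, y_train)
test_acc = accuracy(w, X_test, y_test)
importance = permutation_importance(w, X_test, y_test)
ranking = sorted(range(len(importance)), key=lambda j: -importance[j])

print("held-out accuracy:", round(test_acc, 3))
print("features ranked by importance:", ranking)
```

Permutation importance is used here because it works with any fitted classifier and needs only held-out data, so the same ranking step would apply unchanged if the linear model were replaced by a kernel SVM.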
