Fuzzy decision tree for breast cancer prediction

Medical errors are considered a leading cause of death and injury. Breast cancer has become one of the leading causes of death among women, not only in the Philippines but worldwide. In this paper, data mining is used to predict the stage of breast cancer with a hybrid of fuzzy logic and a decision tree. The aim is to support experts in making decisions rather than to replace them: the result is only a recommendation, and the final decision remains in the hands of the experts. Feature selection was used to determine the best attributes in the dataset from the Surveillance, Epidemiology, and End Results (SEER) program. The dataset covers incidence from 1975 to 2016, but the study limits the analysis to 2010 through 2016. After cleaning and preprocessing of the data, six (6) attributes and one (1) target class were selected. Performance comparison shows that the fuzzy decision tree achieved an accuracy of 99.96%, a sensitivity of 99.26%, and a specificity of 99.98%, all higher than those of the decision tree classification technique. The simulation result shows 165,124 correctly classified instances, which is equivalent to 99.97%, and only 351 incorrectly classified instances, or 0.21%. Thus, a fuzzy decision tree is more robust than the traditional decision tree classifier for predicting the stage of breast cancer.
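The three reported measures (accuracy, sensitivity, specificity) follow from confusion-matrix counts. The sketch below shows the standard definitions for a single class treated one-vs-rest, which is an assumption about how the per-stage figures were obtained. The function name `binary_metrics` and the counts passed to it are hypothetical illustrations, not values from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts.

    tp/fn: positives classified correctly/incorrectly
    tn/fp: negatives classified correctly/incorrectly
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one positive and one negative instance")
    accuracy = (tp + tn) / total      # share of all instances classified correctly
    sensitivity = tp / (tp + fn)      # true positive rate (recall)
    specificity = tn / (tn + fp)      # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts, for illustration only (not from the SEER experiment).
acc, sens, spec = binary_metrics(tp=90, fp=5, tn=95, fn=10)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

For a multi-class target such as cancer stage, these quantities would typically be computed per class and then averaged; the averaging scheme is not stated here.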
