Coronary Artery Disease Diagnosis; Ranking the Significant Features Using a Random Trees Model

Heart disease is one of the most common diseases among middle-aged people. Among the many forms of heart disease, coronary artery disease (CAD) is a common cardiovascular disease with a high death rate. The most popular tool for diagnosing CAD is medical imaging, e.g., angiography. However, angiography is costly and is associated with a number of side effects. Hence, the purpose of this study is to increase the accuracy of CAD diagnosis by selecting the significant predictive features and ranking them in order of importance. We propose an integrated method based on machine learning that uses four models: random trees (RTs), the C5.0 decision tree, the support vector machine (SVM), and the Chi-squared automatic interaction detection (CHAID) decision tree. The proposed method shows promising results, and the study confirms that the RTs model outperforms the other models.
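To make the idea of ranking features with a tree ensemble concrete, the following is a minimal sketch and not the RTs model used in the study. It is a simplified stand-in: it draws bootstrap samples, scores each feature by the largest Gini impurity decrease a single threshold split can achieve, and averages those scores. The data are a synthetic toy set, and the feature names (`informative`, `partial`, `noise`) and function names are hypothetical.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split_gain(rows, labels, feature):
    """Largest Gini decrease achievable by one threshold on `feature`."""
    parent = gini(labels)
    best = 0.0
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        if not left or not right:
            continue  # a split must send samples to both sides
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        best = max(best, parent - weighted)
    return best


def rank_features(rows, labels, names, n_rounds=50, seed=0):
    """Average each feature's best-split gain over bootstrap samples,
    then return (name, score) pairs sorted from most to least important."""
    rng = random.Random(seed)
    n = len(rows)
    totals = {name: 0.0 for name in names}
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        for j, name in enumerate(names):
            totals[name] += best_split_gain(sample_rows, sample_labels, j)
    scores = {name: total / n_rounds for name, total in totals.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic toy data (assumption, not from the study): 20 samples, 3 features.
names = ["informative", "partial", "noise"]
labels = [1 if i >= 10 else 0 for i in range(20)]
rows = []
for i in range(20):
    informative = i / 10                        # separates the classes exactly
    partial = labels[i] if i not in (3, 14) else 1 - labels[i]  # two flipped values
    noise = ((i * 7) % 10) / 10                 # same values in both classes
    rows.append((informative, partial, noise))

ranking = rank_features(rows, labels, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A full random trees or random forest model grows deep trees on random feature subsets and aggregates importance across all splits; the single-split version here only illustrates how an ensemble can turn impurity decreases into an ordered list of features.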
