Ensemble of heterogeneous classifiers for diagnosis and prediction of coronary artery disease with reduced feature subset

BACKGROUND AND OBJECTIVE Coronary artery disease (CAD) is one of the most prominent health problems and a leading cause of mortality worldwide. Early diagnosis and prediction of CAD are therefore essential for appropriate treatment of patients. The objective of this study is to develop a machine learning algorithm that supports accurate diagnosis of CAD.
METHODS We propose a novel heterogeneous ensemble method that combines three base classifiers, namely K-Nearest Neighbour, Random Forest, and Support Vector Machine (SVM), for effective diagnosis of CAD. The outputs of the base classifiers are combined using three ensemble voting techniques: average voting (AVEn), majority voting (MVEn), and weighted-average voting (WAVEn). Relevant features are selected by attribute importance and rank, using the random forest-based Boruta wrapper feature selection algorithm and the feature importance of the SVM.
RESULTS The proposed ensemble algorithm is built on five features selected by feature importance, and its performance is evaluated on the Z-Alizadeh Sani dataset. The dataset is additionally balanced using the Synthetic Minority Over-sampling Technique (SMOTE), and performance is evaluated on the balanced data as well. On the original dataset, the WAVEn algorithm achieves the best results, with classification accuracy, sensitivity, specificity, and precision of 98.97%, 100%, 96.3%, and 98.3%, respectively. On the balanced dataset, WAVEn achieves 100% accuracy, sensitivity, specificity, and precision in diagnosing CAD. To the best of the authors' knowledge, the accuracy achieved by WAVEn is the highest reported among state-of-the-art algorithms in the literature for both the original and the balanced dataset.
CONCLUSIONS The statistical results demonstrate the robustness of the WAVEn algorithm in reliably discriminating CAD patients from healthy individuals with high precision; it can therefore be used to develop a decision support system for diagnosing CAD at an early stage.
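
The three voting rules named in the abstract (AVEn, MVEn, WAVEn) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how the WAVEn weights are derived or which decision threshold is used, so the weights, the 0.5 threshold, the function names, and the example probabilities below are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import Sequence


def average_vote(probs: Sequence[float], threshold: float = 0.5) -> int:
    """Average voting: threshold the unweighted mean of the base classifiers' CAD probabilities."""
    return int(sum(probs) / len(probs) >= threshold)


def majority_vote(labels: Sequence[int]) -> int:
    """Majority voting: return the hard label predicted by most base classifiers."""
    return Counter(labels).most_common(1)[0][0]


def weighted_average_vote(
    probs: Sequence[float], weights: Sequence[float], threshold: float = 0.5
) -> int:
    """Weighted-average voting: threshold the weight-normalised mean probability."""
    score = sum(w * p for w, p in zip(weights, probs)) / sum(weights)
    return int(score >= threshold)


# Hypothetical CAD probabilities from KNN, Random Forest, and SVM for one patient.
probs = [0.40, 0.45, 0.90]
# Hard labels obtained by thresholding each probability at 0.5 (assumed).
labels = [int(p >= 0.5) for p in probs]
# Illustrative weights only; the abstract does not specify how they are chosen.
weights = [0.2, 0.2, 0.6]

ave = average_vote(probs)                     # mean is about 0.583, so class 1
mve = majority_vote(labels)                   # two of three vote 0, so class 0
wave = weighted_average_vote(probs, weights)  # weighted mean is 0.71, so class 1
print(ave, mve, wave)
```

In this made-up case the three rules disagree: majority voting discards how confident each classifier is, while the two averaging rules let the one confident classifier outweigh two borderline ones. With three base classifiers and two classes, majority voting cannot tie.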
