Ensemble Learning Classification for Medical Diagnosis

According to a report by the World Health Organization (WHO), heart disease is the leading cause of death among diseases. Early diagnosis of heart disease is therefore important, and researchers worldwide have worked to develop intelligent support systems that help physicians reach a sound diagnosis sooner. Machine learning can be used to build a medical diagnosis system for predicting heart disease that promises both a more accurate diagnosis and a lower cost of diagnosis. In this paper, we automate the task of predicting heart disease using statistical and machine learning algorithms. We evaluate several widely used algorithms: Logistic Regression, Support Vector Machines (SVM), and Decision Trees, together with ensemble learning methods (Bagging, Boosting, and Random Forests). We conduct the experiments on the Cleveland Heart Dataset, available from the UCI KDD Archive. In our experiments, the dataset distribution fits decision trees well and gives better results with ensemble learning methods; we justify this claim by comparing the ensemble results with those of the other machine learning approaches. The Random Forest classifier works best for this dataset: the randomness of this ensemble approach suits the data well, giving test accuracies of up to 96.26%.
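To make the bagging idea named above concrete, the following is a minimal sketch rather than the implementation used in the paper. It assumes one-level decision trees (stumps) as base learners, a majority vote to combine them, and a small synthetic two-feature dataset in place of the Cleveland data; every function name and parameter here is hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Return (feature, threshold, left_label, right_label) with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(lab != left_label for lab in left)
            errors += sum(lab != right_label for lab in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    j, t, left_label, right_label = stump
    return left_label if row[j] <= t else right_label


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement) of the training set."""
    rng = random.Random(seed)
    n = len(X)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return ensemble


def predict_bagging(ensemble, row):
    """Combine the base learners by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in ensemble)
    return votes.most_common(1)[0][0]


# Synthetic stand-in data (an assumption, not the Cleveland dataset):
# the label depends only on the first feature; the second is pure noise.
data_rng = random.Random(42)
points = [[data_rng.random(), data_rng.random()] for _ in range(300)]
labels = [1 if p[0] > 0.5 else 0 for p in points]
X_train, y_train = points[:200], labels[:200]
X_test, y_test = points[200:], labels[200:]

ensemble = fit_bagging(X_train, y_train)
correct = sum(predict_bagging(ensemble, row) == lab for row, lab in zip(X_test, y_test))
accuracy = correct / len(X_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

A random forest follows the same bootstrap-and-vote pattern but grows deeper trees and additionally restricts each split to a random subset of features, which is the extra source of randomness the abstract refers to.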
