Comparative Analysis of Classification Approaches for Heart Disease Prediction

Heart disease is one of the most common causes of death worldwide. In medical science, enormous amounts of data are often gathered to detect diseases. Not all of this information is useful, but some of it is vital for making the correct decision. Detecting heart disease is therefore not always easy, because early prediction requires expert knowledge of, or experience with, heart failure symptoms. Most medical datasets are dispersed, widespread, and heterogeneous. Data mining, however, is a robust technique for extracting hidden, predictive, and actionable information from extensive databases. In this paper, after applying information gain feature selection to remove unnecessary features, different classification techniques, namely KNN, Decision Tree (ID3), Gaussian Naïve Bayes, Logistic Regression, and Random Forest, are applied to a heart disease dataset for better prediction. Performance measures including accuracy, ROC curve, precision, recall, sensitivity, specificity, and F1-score are used to evaluate the classification techniques. Among them, Logistic Regression performed best, with a classification accuracy of 92.76%.
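As a rough illustration of two ideas named in the abstract, the sketch below computes information gain for a discrete feature and derives the listed evaluation measures from a binary confusion matrix. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy feature values, and the toy labels are all hypothetical and do not come from the heart disease dataset used in the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall (sensitivity), specificity, and F1 for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,  # sensitivity is the same quantity
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical toy data: 1 = disease present, 0 = absent.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
informative = ["y", "y", "y", "y", "n", "n", "n", "n"]    # separates the classes
uninformative = ["a", "b", "a", "b", "a", "b", "a", "b"]  # independent of the class

ig_informative = information_gain(informative, labels)
ig_uninformative = information_gain(uninformative, labels)

# Hypothetical classifier output, used only to exercise the metric function.
predictions = [1, 1, 1, 0, 0, 0, 1, 1]
metrics = binary_metrics(labels, predictions)
```

In this toy setting the informative feature receives the maximum possible gain for a balanced binary label and the uninformative one receives none, which is the ranking a feature selection step would use to decide which features to drop. Continuous clinical attributes would need to be discretized first, and the threshold for discarding features is a choice this sketch does not address.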
