Fisher score and Matthews correlation coefficient-based feature subset selection for heart disease diagnosis using support vector machines

The heart is one of the essential organs of the human body, and its failure is a major cause of death. Coronary heart disease may be asymptomatic, but it can be anticipated from medical tests and the daily routine of the subject. Diagnosing coronary heart disease requires specialized medical personnel with extensive experience. Worldwide, and particularly in developing countries, such experts are scarce, which makes diagnosis more difficult. In this paper, we present a clinical heart disease diagnostic system based on a feature subset selection methodology aimed at improving classification performance. The proposed methodology comprises three algorithms for selecting candidate feature subsets: (1) a mean Fisher score-based feature selection algorithm, (2) a forward feature selection algorithm and (3) a reverse feature selection algorithm. A feature subset selection algorithm then chooses the most decisive subset from the candidate feature subsets. Features are added to the candidate subsets on the basis of their individual Fisher scores, while the selection of a feature subset depends on its Matthews correlation coefficient score and its dimension. The selected feature subset, with reduced dimension, is fed to an RBF kernel-based SVM that performs binary classification: (1) heart disease patient and (2) normal control subject. The proposed methodology is validated in terms of accuracy, specificity and sensitivity on four UCI datasets: Cleveland, Switzerland, Hungarian and SPECTF. The statistical results obtained with the proposed technique are compared with those of existing techniques and show better performance. The proposed technique achieves an accuracy of 81.19, 84.52, 92.68 and 82.7% for Cleveland, Hungarian, Switzerland and SPECTF, respectively.
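The two scoring quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the Fisher score below uses the common between-class over within-class scatter form, the function names (fisher_scores, select_above_mean, mcc) are hypothetical, and the rule of keeping features whose score exceeds the mean score is only an assumed reading of "mean Fisher score-based" selection. The toy data are made up for illustration.

```python
import math


def fisher_scores(X, y):
    """Per-feature Fisher score: between-class scatter / within-class scatter.

    X is a list of samples (each a list of feature values), y the class labels.
    Assumption: the standard Fisher score form; the paper may use a variant.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        overall_mean = sum(col) / len(col)
        between = 0.0
        within = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            n_c = len(vals)
            mean_c = sum(vals) / n_c
            var_c = sum((v - mean_c) ** 2 for v in vals) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        # A feature with zero within-class scatter gets score 0 here (a choice
        # made for this sketch to avoid division by zero).
        scores.append(between / within if within > 0 else 0.0)
    return scores


def select_above_mean(scores):
    """Assumed reading of mean-Fisher-score selection: keep features whose
    score is strictly greater than the mean score."""
    mean_score = sum(scores) / len(scores)
    return [j for j, s in enumerate(scores) if s > mean_score]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [2.0, 3.0], [8.0, 4.0], [9.0, 6.0]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)
selected = select_above_mean(scores)
```

In a full pipeline, each candidate subset would be scored by the MCC of a classifier trained on it (an RBF-kernel SVM in the paper), with ties presumably broken in favor of the smaller subset; that step is omitted here because the abstract does not specify it.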
