Feature selection for medical diagnosis: Evaluation for using a hybrid Stacked-Genetic approach in the diagnosis of heart disease

Background and purpose: Heart disease has been one of the leading causes of death over the last 10 years, so classification methods for diagnosing and predicting it are of great importance. If the disease is predicted before its onset, its high mortality can be prevented and more accurate and efficient treatment can be provided.

Materials and Methods: Depending on the input features selected, basic learning algorithms can be very time-consuming. Reducing dimensionality, or choosing a good subset of features without sacrificing accuracy, is therefore essential for applying these algorithms successfully in this domain. In this paper, we propose an ensemble-genetic learning method that uses wrapper-based feature reduction to select features for disease classification.

Findings: A medical diagnosis system based on ensemble learning predicts heart disease more accurately than the traditional method and reduces the cost of treatment.

Conclusion: The results showed that thallium scan and vascular occlusion were the most important features in the diagnosis of heart disease, and that they can distinguish between sick and healthy people with 97.57% accuracy.
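The wrapper-style genetic feature selection named in Materials and Methods can be illustrated with a minimal sketch. This is not the authors' implementation: a nearest-centroid classifier stands in for the stacked ensemble, synthetic data stands in for the heart disease dataset, and every name and parameter below (`make_data`, `centroid_accuracy`, `genetic_select`, the 0.01 feature penalty, the population settings) is a hypothetical choice made only for illustration.

```python
import random

def make_data(rng, n=120):
    """Synthetic stand-in data: features 0 and 1 carry the label, 2-5 are noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [label * 2.0 + rng.gauss(0, 0.5), -label * 2.0 + rng.gauss(0, 0.5)]
        row += [rng.gauss(0, 1.0) for _ in range(4)]
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X, y, mask):
    """Hold-out accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    half = len(X) // 2  # first half trains, second half tests
    centroids = {}
    for label in (0, 1):
        rows = [X[i] for i in range(half) if y[i] == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for i in range(half, len(X)):
        dist = {lab: sum((X[i][j] - c) ** 2 for j, c in zip(cols, cen))
                for lab, cen in centroids.items()}
        correct += min(dist, key=dist.get) == y[i]
    return correct / (len(X) - half)

def genetic_select(X, y, rng, pop_size=20, generations=15, p_mut=0.1):
    """Wrapper selection: each bit mask is scored by training the classifier on it."""
    n_feat = len(X[0])

    def fitness(mask):
        # Assumed trade-off: accuracy minus a small penalty per selected feature.
        return centroid_accuracy(X, y, mask) - 0.01 * sum(mask)

    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        nxt = [pop[0][:], pop[1][:]]  # elitism: keep the two best masks
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randint(1, n_feat - 1)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    return best, centroid_accuracy(X, y, best)

rng = random.Random(0)
X, y = make_data(rng)
best_mask, best_acc = genetic_select(X, y, rng)
print(best_mask, round(best_acc, 3))
```

Because the fitness function retrains and re-evaluates the classifier for every candidate mask, the cost of a wrapper method grows with the cost of the underlying model, which is why a smaller selected subset matters when the model is an ensemble.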
