Coronary Heart Disease Diagnosis Through Self-Organizing Map and Fuzzy Support Vector Machine with Incremental Updates

The trade-off between computation time and predictive accuracy is a central concern in the design and implementation of clinical decision support systems. Machine learning techniques with incremental updates have proven their usefulness in analyzing large collections of medical data for disease diagnosis. This research aims to develop a predictive method for heart disease diagnosis using machine learning techniques. To this end, the proposed method combines unsupervised and supervised learning techniques. In particular, it relies on Principal Component Analysis (PCA), Self-Organizing Map (SOM), Fuzzy Support Vector Machine (Fuzzy SVM), and two imputation techniques for handling missing values. Furthermore, we apply incremental PCA and incremental Fuzzy SVM for incremental learning of the data to reduce the computation time of disease prediction. Our analysis of two real-world datasets, Cleveland and Statlog, showed that the use of incremental Fuzzy SVM can significantly improve the accuracy of heart disease classification. The experimental results further revealed that the method reduces the computation time of disease diagnosis relative to the non-incremental learning technique.
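To make the idea of a fuzzy SVM with incremental updates concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear model trained by stochastic subgradient steps on a membership-weighted hinge loss, with memberships derived from each sample's distance to its class mean. It leaves out the PCA, SOM, and imputation stages entirely. The names `fuzzy_memberships`, `IncrementalFuzzySVM`, and `partial_fit`, the hyperparameter defaults, the per-chunk recomputation of memberships, and the toy data are all assumptions made for illustration.

```python
import math
import random


def fuzzy_memberships(X, y, delta=1e-3):
    """Give each sample a weight in (0, 1] that shrinks with distance from its class mean."""
    s = [0.0] * len(X)
    dim = len(X[0])
    for label in set(y):
        idx = [i for i, lab in enumerate(y) if lab == label]
        center = [sum(X[i][d] for i in idx) / len(idx) for d in range(dim)]
        dist = {i: math.dist(X[i], center) for i in idx}
        radius = max(dist.values())
        for i in idx:
            # delta > 0 keeps the farthest sample's weight above zero.
            s[i] = 1.0 - dist[i] / (radius + delta)
    return s


class IncrementalFuzzySVM:
    """Linear SVM with membership-weighted hinge loss, trained chunk by chunk."""

    def __init__(self, dim, lam=0.01, lr=0.05):
        self.w = [0.0] * dim
        self.b = 0.0
        self.lam = lam  # L2 regularization strength (hypothetical default)
        self.lr = lr    # step size (hypothetical default)

    def decision(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def predict(self, x):
        return 1 if self.decision(x) >= 0 else -1

    def partial_fit(self, X, y, s, epochs=50, seed=0):
        """Continue training from the current weights on one new chunk."""
        rng = random.Random(seed)
        order = list(range(len(X)))
        for _ in range(epochs):
            rng.shuffle(order)
            for i in order:
                violated = y[i] * self.decision(X[i]) < 1.0
                # Subgradient step on lam/2 * ||w||^2 + s_i * hinge(y_i * f(x_i)).
                self.w = [wi * (1.0 - self.lr * self.lam) for wi in self.w]
                if violated:
                    step = self.lr * s[i] * y[i]
                    self.w = [wi + step * xi for wi, xi in zip(self.w, X[i])]
                    self.b += step
        return self


# Toy data standing in for two batches of patient records (labels are +1 / -1).
chunk1 = ([[2.0, 2.0], [3.0, 2.0], [2.0, 3.0], [-2.0, -2.0], [-3.0, -2.0], [-2.0, -3.0]],
          [1, 1, 1, -1, -1, -1])
chunk2 = ([[2.5, 2.5], [3.0, 3.0], [2.0, 2.5], [-2.5, -2.5], [-3.0, -3.0], [-2.0, -2.5]],
          [1, 1, 1, -1, -1, -1])

model = IncrementalFuzzySVM(dim=2)
for X, y in (chunk1, chunk2):
    # Memberships are recomputed per chunk (an assumption of this sketch).
    model.partial_fit(X, y, fuzzy_memberships(X, y))

print(model.predict([3.0, 3.0]), model.predict([-3.0, -3.0]))
```

The second call to `partial_fit` starts from the weights learned on the first chunk instead of retraining on all data, which is where an incremental scheme can save computation time. The membership weights reduce the influence of samples far from their class mean, so likely outliers contribute less to the decision boundary.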
