Hybrid model for prediction of heart disease

Heart disease is a leading cause of death worldwide. Reducing its mortality rate requires effective and timely diagnosis, and numerous automated decision support systems have been developed for this purpose. This research introduces a predictive model built on two-level optimization, intended to save lives and reduce cost through effective diagnosis. Level-1 optimization first identifies, in parallel on a parallel machine, an optimal proportion (P_opt) between training and test sets for each dataset. It then searches, again in parallel, for the best training set (T_best) for P_opt. Level-2 optimization refines the rule set (R) generated by the Perfect Rule Induction by Sequential Method (PRISM) learner on T_best, using a parallel genetic algorithm (GA). The experimental results obtained by the model on heart disease datasets (collected from https://archive.ics.uci.edu/ml ) are compared with those of its base learner and of four state-of-the-art learners: C4.5 (a decision tree-based classifier), Naïve Bayes, a neural network and a support vector machine. The empirical outcomes, based on key performance metrics (prediction accuracy, precision, recall, area under the curve, and true positive and false positive rates), indicate that the new model is proficient at predicting heart disease. Importantly, the prediction accuracy of the presented hybrid model exceeds that of the sequential GA-based hybrid model by around 6% on almost all the chosen datasets. The proposed system may therefore work as an e-doctor that predicts heart attack and assists clinicians in taking precautionary steps.
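The level-1 search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything beyond the two-step idea (find P_opt, then find T_best for P_opt) is an assumption made for illustration: the synthetic one-feature dataset, the threshold learner standing in for PRISM, the candidate proportions, the use of random seeds to generate candidate training sets, and the use of test accuracy as the selection criterion. A thread pool is used only to keep the example portable; a real speedup would need processes or a parallel machine.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def make_toy_data(n=200, seed=0):
    """Synthetic stand-in for a heart disease dataset: one feature, binary label."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0, 1)
        label = int(x + rng.gauss(0, 0.15) > 0.5)  # noisy threshold rule
        data.append((x, label))
    return data


def fit_threshold(train):
    """Placeholder learner (NOT PRISM): cut-point with the best training accuracy."""
    candidates = sorted(x for x, _ in train)
    return max(candidates, key=lambda t: sum(int(x > t) == y for x, y in train))


def accuracy(threshold, rows):
    """Fraction of rows whose label matches the threshold prediction."""
    return sum(int(x > threshold) == y for x, y in rows) / len(rows)


def evaluate_split(data, proportion, seed):
    """Shuffle with a seed, split by proportion, train, return (test accuracy, training set)."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * proportion)
    train, test = rows[:cut], rows[cut:]
    return accuracy(fit_threshold(train), test), train


def mean_accuracy(data, proportion, seeds):
    """Average test accuracy of one proportion over several random splits."""
    return sum(evaluate_split(data, proportion, s)[0] for s in seeds) / len(seeds)


def level1_search(data, proportions=(0.5, 0.6, 0.7, 0.8), seeds=range(5), workers=4):
    """Hypothetical level-1 search: first P_opt, then T_best for P_opt."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Step 1: score every candidate proportion concurrently.
        scores = list(pool.map(lambda p: mean_accuracy(data, p, seeds), proportions))
        p_opt = proportions[scores.index(max(scores))]
        # Step 2: for P_opt, evaluate candidate training sets concurrently.
        results = list(pool.map(lambda s: evaluate_split(data, p_opt, s), seeds))
    best_acc, t_best = max(results, key=lambda r: r[0])
    return p_opt, t_best, best_acc


if __name__ == "__main__":
    p_opt, t_best, best_acc = level1_search(make_toy_data())
    print(f"P_opt={p_opt}, |T_best|={len(t_best)}, accuracy={best_acc:.3f}")
```

Selecting the split by test accuracy, as this sketch does, gives an optimistic estimate; it is used here only to keep the example short, and the abstract does not state which criterion the model actually applies.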
