An analytical method for diseases prediction using machine learning techniques

Abstract The use of medical datasets has attracted the attention of researchers worldwide. Data mining techniques have been widely used to develop decision support systems that predict diseases from medical datasets. In this paper, we propose a new knowledge-based system for disease prediction that combines clustering, noise removal, and prediction techniques. We use Classification and Regression Trees (CART) to generate the fuzzy rules used in the knowledge-based system. We test the proposed method on several public medical datasets. Results on the Pima Indian Diabetes, Mesothelioma, WDBC, StatLog, Cleveland, and Parkinson's telemonitoring datasets show that the proposed method markedly improves disease prediction accuracy. The results also show that combining a fuzzy rule-based approach and CART with noise removal and clustering techniques can be effective for disease prediction from real-world medical datasets. The knowledge-based system can assist medical practitioners in healthcare practice as a clinical analytical method.
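The rule-generation step can be illustrated with a minimal sketch. It is not the authors' implementation: it finds a single CART-style binary split by minimizing Gini impurity and prints the split as two crisp IF-THEN rules, which a fuzzy system would then soften with membership functions. The toy records, the feature names `glucose` and `bmi`, and all helper names are assumptions made for illustration only.

```python
# Illustrative sketch only: toy data and helper names are hypothetical,
# not taken from the paper.
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    best = (None, None, float("inf"))
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best


def majority(labels):
    """Most frequent class label in a branch."""
    return Counter(labels).most_common(1)[0][0]


def split_to_rules(rows, labels, names):
    """Express the best split as two crisp IF-THEN rules."""
    j, t, _ = best_split(rows, labels)
    left = [y for r, y in zip(rows, labels) if r[j] <= t]
    right = [y for r, y in zip(rows, labels) if r[j] > t]
    return [
        f"IF {names[j]} <= {t} THEN class = {majority(left)}",
        f"IF {names[j]} > {t} THEN class = {majority(right)}",
    ]


# Hypothetical toy records: (glucose, bmi); label 1 = positive, 0 = negative.
ROWS = [(85, 22.0), (90, 31.0), (100, 24.5), (150, 30.0), (160, 23.0), (170, 35.5)]
LABELS = [0, 0, 0, 1, 1, 1]
RULES = split_to_rules(ROWS, LABELS, ["glucose", "bmi"])

if __name__ == "__main__":
    for rule in RULES:
        print(rule)
```

A full CART tree applies this split search recursively to each branch and prunes the result. In a pipeline like the one described, such a tree would presumably be trained on the data after clustering and noise removal, but the sketch does not reproduce those steps.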
