A Novel Hybrid Model for Diabetic Prediction using Hidden Markov Model, Fuzzy based Rule Approach and Neural Network

Objectives: Data mining approaches are used to develop decision-making systems. The current study proposes a novel hybrid model for diabetes prediction using data mining techniques. The main objective is to improve the accuracy rate by significantly reducing the size of the data under analysis at every stage.
Methods/Statistical Analysis: The PIMA female diabetic dataset, obtained from the UCI repository, is used. Training and testing samples are drawn using 10-fold cross-validation. Three rank-based selection techniques are used for attribute selection. The associations between attributes are identified, and clustering is then performed under criticality using a Hidden Markov Model (HMM) and a fuzzy-improved neural network.
Findings: The data size reduces significantly when appropriate selection methods are applied in the respective sequence. For categorical data, the gain ratio attribute selection method outperforms the other selection methods considered. Clustering is more effective when performed after identifying the exact associations among attributes. The proposed hybrid model achieved an overall accuracy of 92%. As the comparative analysis shows, the blend of supervised and unsupervised techniques achieved better results than the same techniques applied individually to the same data. Earlier prediction models worked with either classification or clustering, whereas the present study uses both, with the fuzzy-improved neural network predicting diabetes from the data. The result analysis showed that prediction accuracy is poor when the classifiers are applied separately (Naive Bayes: 76.30%, Neural Network: 75.13%, Support Vector Machine: 77.47%, K-Nearest Neighbor: 69.79%, Decision Tree (J48): 74.21%), but improves when they are combined.
Application/Improvements: The proposed hybrid model can be used as an expert system application, under the guidance of a diabetes expert, to assist physicians in making decisions about early diagnosis of the disease. In future, the proposed model can be applied to a gender-independent dataset. Further, the accuracy rate of the model can be improved by replacing the missing values in the dataset with the most appropriate values.
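
Illustrative sketch: the abstract does not give the formulas or code behind its rank-based attribute selection, so the following is only a minimal Python sketch of the standard gain ratio measure for a categorical attribute (information gain divided by split information). The function names and the toy attribute values (`glucose_band`, `age_band`, `outcome`) are hypothetical and are not taken from the PIMA dataset or from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute with respect to the class labels."""
    total = len(labels)
    # Group the class labels by the attribute value they occur with.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain = H(class) - weighted H(class | attribute).
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalises attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data (NOT from the PIMA dataset), for illustration only.
attributes = {
    "glucose_band": ["high", "high", "low", "low", "high", "low", "high", "low"],
    "age_band": ["young", "old", "young", "old", "young", "old", "young", "old"],
}
outcome = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]

# Rank attributes from most to least informative by gain ratio.
ranking = sorted(attributes,
                 key=lambda name: gain_ratio(attributes[name], outcome),
                 reverse=True)
```

In this toy example `glucose_band` separates the two classes perfectly and is therefore ranked first. A rank-based selection step would keep the top-ranked attributes and discard the rest, which is one way the data under analysis can shrink before later stages.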
