Performance Analysis of Data Mining Classification Techniques to Predict Diabetes

Abstract: Diabetes mellitus is one of the major health challenges worldwide. Its prevalence is increasing rapidly, with damaging human, economic and social consequences. The prevention and prediction of diabetes mellitus are therefore gaining interest in the healthcare community. Several clinical decision support systems have been proposed that incorporate data mining techniques to predict diabetes and its course of progression, but these conventional systems are typically based either on a single classifier or on a plain combination of classifiers. Recently, extensive efforts have been made to improve the accuracy of such systems using ensemble classifiers. This study applies the AdaBoost and bagging ensemble techniques, with the J48 (C4.5) decision tree as the base learner, alongside a standalone J48 decision tree, to classify patients with diabetes mellitus using diabetes risk factors. Classification is performed across three ordinal adult groups in the Canadian Primary Care Sentinel Surveillance Network. The experimental results show that the overall performance of the AdaBoost ensemble is better than that of both bagging and the standalone J48 decision tree.
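
To make the two ensemble strategies named in the abstract concrete, the following is a minimal Python sketch of AdaBoost and bagging. It is an illustration under stated assumptions, not the study's implementation: one-level decision stumps stand in for the J48 (C4.5) base learner, the data are synthetic rather than drawn from the Canadian Primary Care Sentinel Surveillance Network, and every function name (`fit_stump`, `adaboost`, `bagging`, `make_toy_data`) is hypothetical.

```python
import math
import random


def stump_predict(stump, row):
    """Predict +1/-1 with a one-level tree: (feature index, threshold, polarity)."""
    j, t, pol = stump
    return pol if row[j] >= t else -pol


def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error.

    Assumption: a stump is a simplified stand-in for a J48 (C4.5) tree.
    """
    best_err, best_stump = None, None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                stump = (j, t, pol)
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(stump, row) != yi)
                if best_err is None or err < best_err:
                    best_err, best_stump = err, stump
    return best_stump, best_err


def adaboost(X, y, rounds=10):
    """Discrete AdaBoost: reweight examples so later stumps focus on mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump, err = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_predict(model, row):
    """Weighted vote of all boosted stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1


def bagging(X, y, rounds=10, seed=0):
    """Bagging: fit one stump per bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(X)
    model = []
    for _ in range(rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        stump, _ = fit_stump(Xb, yb, [1.0 / n] * n)
        model.append(stump)
    return model


def bagging_predict(model, row):
    """Unweighted majority vote of all bagged stumps."""
    votes = sum(stump_predict(stump, row) for stump in model)
    return 1 if votes >= 0 else -1


def accuracy(predict, model, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(model, row) == yi for row, yi in zip(X, y)) / len(y)


def make_toy_data(n=40, seed=42):
    """Synthetic two-feature data; NOT patient data and not from the study."""
    rng = random.Random(seed)
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [1 if row[0] + row[1] > 1.0 else -1 for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print("AdaBoost:", accuracy(adaboost_predict, adaboost(X, y), X, y))
    print("Bagging: ", accuracy(bagging_predict, bagging(X, y), X, y))
```

The sketch reports training accuracy on toy data only, so it says nothing about the relative performance reported in the study. The design difference it shows is that AdaBoost trains its base learners sequentially on reweighted examples and combines them by weighted vote, whereas bagging trains them independently on bootstrap resamples and combines them by simple majority vote.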
