Predictive Analytics in Healthcare for Diabetes Prediction

Type 2 diabetes mellitus is a chronic disease that poses a serious challenge to human health worldwide; about 8.3% of the global population has been diagnosed with it. The application of predictive analytics to diabetes diagnosis is gaining significant momentum in medical research. This paper aims to aid medical professionals in the early detection and efficient diagnosis of Type 2 diabetes. We apply bioinformatics theory and supervised machine learning techniques to improve the accuracy of diabetes prediction, using the 8 clinical measurements in the widely used PIMA dataset. We outline our methodology, highlight the implementation steps, and review prominent past work in the field. We also evaluate several well-known machine learning algorithms and provide a detailed comparison of the results obtained from each. Gradient boosting with parameter tuning proves the most successful, with an F1 score of 0.853 and an out-of-sample accuracy of 89.94%. Our prediction model computes the probability of the onset of diabetes in an individual from their clinical data. The main benefits of applying this research in the healthcare sector are its cost-effectiveness and its ability to deliver an instant diagnosis. With this work, we intend to improve the process of diagnosing Type 2 diabetes and to inspire other researchers to use machine learning techniques for further inquiry into diabetes prediction.
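For readers less familiar with the two reported metrics, the following is a minimal sketch of how accuracy and the F1 score are computed for a binary classifier. It is not the paper's evaluation code: the function name `classification_metrics` and the ten toy labels are hypothetical, chosen only to make the arithmetic easy to follow, and they do not come from the PIMA dataset.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = diabetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels (illustrative only, not study data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because F1 ignores true negatives, it can differ noticeably from accuracy when the classes are imbalanced, which is presumably why both figures are reported.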
