Investigation of Diabetes Data with Permutation Feature Importance-Based Deep Learning Methods

Diabetes is a metabolic disease characterized by high blood sugar levels. Left untreated, it can cause health problems in many vital organs of the body. Recent machine learning techniques can be used to help diagnose diabetes at an early stage. In this study, a data set from the laboratories of Medical City Hospital Endocrinology and Diabetes Specialization Center Al Kindy Training Hospital was used. The data set consists of three classes: normal, pre-diabetes and diabetes. It was classified using three deep learning methods: Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN) and Gated Recurrent Unit (GRU). The classification performance of each algorithm was evaluated using accuracy, precision, sensitivity and F-score. Classification accuracy was 96.5% with the LSTM algorithm, 94% with the CNN algorithm and 93% with the GRU algorithm. The Permutation Feature Importance (PFI) method was also used to determine the effect of each feature in the data set on classification performance. This analysis shows that the HbA1c feature is an important parameter for all of the deep learning methods used. The originality of the study lies in both the results obtained with the LSTM algorithm and the identification of the most important feature affecting classification success. The results suggest that these methods can provide healthcare professionals with a prognostic tool for effective decision-making that can assist in the early detection of the disease.
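The idea behind PFI can be illustrated with a minimal sketch. This is not the study's implementation: the data below are synthetic, the classifier is a hypothetical threshold rule standing in for a trained LSTM, CNN or GRU model, and the function and feature names are assumptions made for illustration. The importance of a feature is estimated as the drop in accuracy observed when that feature's column is randomly shuffled.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return float(np.mean(y_true == y_pred))


def permutation_feature_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop per feature after shuffling that feature's column.

    `predict` is any callable mapping a feature matrix to class labels.
    """
    rng = np.random.default_rng(seed)
    baseline = accuracy(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling breaks the link between feature j and the labels.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - accuracy(y, predict(X_perm)))
        importances[j] = np.mean(drops)
    return baseline, importances


# Synthetic stand-in data (NOT the hospital data set): three features,
# three classes, with the class determined by the first feature only.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(300, 3))
y = np.digitize(X[:, 0], bins=[-0.5, 0.5])  # labels 0, 1, 2


def toy_predict(features):
    """Hypothetical classifier: thresholds the first feature."""
    return np.digitize(features[:, 0], bins=[-0.5, 0.5])


baseline, importances = permutation_feature_importance(toy_predict, X, y)
feature_names = ["feature_0", "feature_1", "feature_2"]  # placeholder names
for name, value in zip(feature_names, importances):
    print(f"{name}: mean accuracy drop = {value:.3f}")
```

In this toy setting only the first feature shows a non-zero accuracy drop, because the stand-in classifier ignores the other two. With a trained network, `predict` would wrap the model's inference call, and the same score could be computed with precision, sensitivity or F-score in place of accuracy.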
