Diabetes Diseases Prediction Using Supervised Machine Learning and Neighbourhood Components Analysis

Diabetes mellitus (DM) is a chronic disease that can affect the entire body. Early diagnosis of diabetes can help improve patients' quality of life or reduce risk factors. The main objective of this study is to evaluate the performance of several machine learning algorithms used to predict diabetes. For this purpose, we apply and evaluate four machine learning algorithms (Decision Tree, K-Nearest Neighbours, Artificial Neural Network and Deep Neural Network) to predict diabetes mellitus. These techniques were trained and tested on the Pima Indian dataset. The performance of the algorithms was evaluated after removing noisy data and applying feature selection with Neighbourhood Components Analysis, in order to reduce the number of features and mitigate the complexity of dimensionality, which speeds up the learning process and enhances data understanding. Several evaluation metrics, such as accuracy, sensitivity, and specificity, are used to compare model performance.
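As an illustration of the three metrics named above, the following minimal sketch computes accuracy, sensitivity, and specificity from binary labels using their standard confusion-matrix definitions. The function names and the toy labels are assumptions made for this example. They are not taken from the study, and the numbers are not its results.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity (recall on positives) and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Sensitivity: share of diabetic cases correctly flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of non-diabetic cases correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical toy labels (1 = diabetic, 0 = non-diabetic), for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters in a screening setting, because accuracy alone can hide a model that misses many positive cases when the classes are imbalanced.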
