Machine Learning Methods for Diabetes Prevalence Classification in Saudi Arabia

Machine learning algorithms are widely used in public health to predict or diagnose chronic diseases such as diabetes mellitus, which is classified as an epidemic because of its high global prevalence. Machine learning techniques support the description, prediction, and evaluation of various diseases, including diabetes. This study investigates how well different classification methods classify diabetes prevalence rates and the predicted trends in the disease according to associated behavioural risk factors (smoking, obesity, and inactivity) in Saudi Arabia. Classification models for diabetes prevalence were developed using several machine learning algorithms: linear discriminant (LD), support vector machine (SVM), K-nearest neighbour (KNN), and neural network pattern recognition (NPR). Four SVM kernel functions (linear, Gaussian, quadratic, and cubic) and two types of KNN (fine and weighted) were used. The accuracy of each developed model was determined, and the classifiers were compared in the Classification Learner App in MATLAB according to prediction speed and training time. The experimental results showed that weighted KNN performed well in predicting the diabetes prevalence rate, with the highest average accuracy of 94.5% and less training time than the other classification methods, for both the men's and women's datasets.
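The abstract singles out weighted KNN but does not describe how it was configured. As a minimal sketch of the general technique only, and not the study's MATLAB Classification Learner setup, the following Python function classifies a query point by letting each of its k nearest neighbours vote with a weight that shrinks with distance. The inverse-squared-distance weighting, the value of k, the function name `weighted_knn_predict`, and every feature value and class label shown are illustrative assumptions, not data from the study.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_X, train_y, query, k=3):
    """Return the class whose k nearest neighbours carry the most weight.

    Each neighbour votes with weight 1 / distance**2 (an assumed scheme),
    so close neighbours count for more than distant ones.
    """
    # Euclidean distance from the query to every training point, nearest first.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = defaultdict(float)
    for d, label in dists[:k]:
        if d == 0.0:
            return label  # an exact match decides the class outright
        votes[label] += 1.0 / (d * d)
    return max(votes, key=votes.get)


# Made-up risk-factor rows: (smoking %, obesity %, inactivity %).
# The "low"/"high" prevalence classes are hypothetical labels.
train_X = [(10, 20, 30), (12, 22, 35), (11, 25, 33),
           (25, 40, 60), (28, 42, 65), (27, 38, 58)]
train_y = ["low", "low", "low", "high", "high", "high"]

print(weighted_knn_predict(train_X, train_y, (13, 23, 34), k=3))
print(weighted_knn_predict(train_X, train_y, (26, 39, 61), k=3))
```

The weighting is what separates this from a plain majority vote: a single very close neighbour can outweigh several distant ones of another class, which is one plausible reason a weighted variant might do well on small tabular datasets.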
