RG Hyperparameter Optimization Approach for Improved Indirect Prediction of Blood Glucose Levels by Boosting Ensemble Learning

This paper proposes an RG hyperparameter optimization approach, which applies random search (R) followed by grid search (G), to improve the blood glucose level prediction of boosting ensemble learning models. Blood glucose levels of patients are predicted indirectly from historical medical data collected through physical examinations, using 40 health indicators of the human body. Experiments with real clinical data show that the proposed RG double optimization approach improves the prediction performance of the four state-of-the-art boosting ensemble learning models to which it was applied, achieving an MSE improvement of 1.47% to 24.40% and an RMSE improvement of 0.75% to 11.54%.
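The sequential use of random search and grid search can be illustrated with a minimal sketch. The abstract does not say how the second stage is set up, so the version below makes an assumption: the grid stage explores a small neighborhood around the best point found by the random stage. All names (`rg_search`, `local_grid`, `toy_mse`) and the search ranges are hypothetical, and `toy_mse` is only a stand-in for the cross-validated MSE of a boosting model.

```python
import itertools
import random


def random_search(objective, bounds, n_iter, seed=0):
    """Stage R: evaluate n_iter points sampled uniformly from the box `bounds`."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_iter):
        params = {k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def local_grid(center, bounds, width=0.2, points=5):
    """Build a small grid around `center`, clipped to `bounds`.

    Assumption (not from the paper): each axis spans `width` times the
    original range and is centered on the random-search result.
    """
    grid = {}
    for name, (lo, hi) in bounds.items():
        half = width * (hi - lo) / 2
        start = max(lo, center[name] - half)
        stop = min(hi, center[name] + half)
        step = (stop - start) / (points - 1)
        grid[name] = [start + i * step for i in range(points)]
    return grid


def grid_search(objective, grid):
    """Stage G: evaluate every combination of the values listed in `grid`."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def rg_search(objective, bounds, n_random=30, seed=0):
    """Run stage R, then stage G around the stage-R result."""
    r_params, r_score = random_search(objective, bounds, n_random, seed)
    g_params, g_score = grid_search(objective, local_grid(r_params, bounds))
    # Keep the random-search result if the grid did not improve on it.
    if g_score <= r_score:
        return g_params, g_score, r_score
    return r_params, r_score, r_score


def toy_mse(params):
    """Hypothetical stand-in for a cross-validated MSE (lower is better)."""
    return (params["learning_rate"] - 0.1) ** 2 + (params["subsample"] - 0.8) ** 2


if __name__ == "__main__":
    bounds = {"learning_rate": (0.01, 0.5), "subsample": (0.5, 1.0)}
    params, final_score, random_score = rg_search(toy_mse, bounds)
    print("after R:", random_score)
    print("after R+G:", final_score, params)
```

In a real experiment the objective would train and validate a boosting model for each candidate setting; scikit-learn's `RandomizedSearchCV` and `GridSearchCV` could play the roles of the two stages, although the abstract does not state which tooling the authors used.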
