Predicting Glycaemia in Type 1 Diabetes Patients: Experiments in Feature Engineering and Data Imputation

Patients with type 1 diabetes manually regulate their blood glucose concentration by adjusting insulin dosage in response to factors such as carbohydrate intake and exercise intensity. Automated near-term prediction of blood glucose concentration is essential for preventing hyper- and hypoglycaemic events in these patients and for helping physicians and patients improve control of blood glucose levels. Imperfect patient monitoring introduces missing values into all of the variables that are important for predicting blood glucose level, which makes data imputation necessary. In this paper, we investigated the importance of individual variables and explored several feature engineering methods for predicting blood glucose level. We then developed a new empirical imputation method and compared the predictive accuracy achieved under different methods of imputing missing data. We also examined the influence of past signal values on the prediction of blood glucose levels. We reported the relative performance of the predictive models across different testing scenarios and imputation methods. Finally, we identified an optimal combination of data imputation methods and built an ensemble model for reliable prediction of blood glucose levels on a 30-minute horizon.
