Enhancing Grammatical Evolution Through Data Augmentation: Application to Blood Glucose Forecasting

Patients with Type 1 Diabetes Mellitus are hopefully awaiting the arrival of the Artificial Pancreas (AP) in the near future. AP systems will control the blood glucose of people who suffer from the disease, improving their lives and reducing the risks they face every day. At the core of the AP, an algorithm will forecast future glucose levels and estimate insulin bolus sizes. Grammatical Evolution (GE) has proven to be a suitable algorithm for predicting glucose levels. Nevertheless, one of the main obstacles that researchers have found when training GE models is the lack of sufficient data. As in many other fields of medicine, collecting data from real patients is very complex. In this paper, we propose a data augmentation algorithm that generates synthetic glucose time series from real data. The synthetic time series can be used to train a single GE model or to produce several GE models that work together in a combining system. Our experimental results show that, when data are scarce, Grammatical Evolution models can produce more accurate and robust predictions using data augmentation.
