On-line policy learning and adaptation for real-time personalization of an artificial pancreas

The importance of tailoring an artificial pancreas to a given patient is addressed. On-line policy learning integrates reinforcement learning with Gaussian processes. Only relevant data is used to update the control policy. Fast policy adaptation allows dealing with patient-specific glycemic variability.

The dynamic complexity of the glucose-insulin metabolism in diabetic patients is the main obstacle to the widespread use of an artificial pancreas. The significant level of subject-specific glycemic variability requires continuously adapting the control policy to cope with daily changes in the patient's metabolism and lifestyle. This paper proposes an on-line selective reinforcement learning algorithm that adapts a control policy in real time, based on ongoing interactions with the patient, so as to tailor the artificial pancreas to that patient. Adaptation includes two on-line procedures: sparsification and parameter updating of the Gaussian process used to approximate the control policy. With the proposed sparsification method, the support data dictionary for on-line learning is modified by checking whether the arriving data stream contains novel information that should be added to the dictionary in order to personalize the policy. Results obtained in in silico experiments demonstrate that on-line policy learning is both safe and efficient for maintaining blood glucose variability within the normoglycemic range.
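The dictionary update described above can be illustrated with a small sketch. The abstract does not state which novelty criterion is used, so this example assumes an approximate linear dependence (ALD) test of the kind used in kernel recursive least-squares: a new sample is added only if it is poorly represented, in kernel feature space, by the samples already stored. The kernel, the threshold value, the function names, and the toy data stream are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math


def rbf_kernel(x, y, length_scale=1.0):
    """Squared-exponential kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def novelty(dictionary, x, jitter=1e-8):
    """ALD residual: how poorly x is represented by the stored samples."""
    if not dictionary:
        return rbf_kernel(x, x)
    # Gram matrix of the dictionary, with a small jitter for numerical stability.
    gram = [[rbf_kernel(a, b) + (jitter if i == j else 0.0)
             for j, b in enumerate(dictionary)]
            for i, a in enumerate(dictionary)]
    k_vec = [rbf_kernel(a, x) for a in dictionary]
    coeffs = solve(gram, k_vec)
    return rbf_kernel(x, x) - sum(c * k for c, k in zip(coeffs, k_vec))


def update_dictionary(dictionary, x, threshold=0.1):
    """Append x only if its residual exceeds the (assumed) novelty threshold."""
    if novelty(dictionary, x) > threshold:
        dictionary.append(x)
        return True
    return False


# Hypothetical two-dimensional patient states in scaled units, illustration only.
stream = [(1.0, 0.5), (1.01, 0.5), (1.8, 0.2), (1.0, 0.51), (0.6, 1.0)]
dictionary = []
added = [update_dictionary(dictionary, s) for s in stream]
```

In this toy stream, samples that nearly duplicate a stored state are rejected, so the dictionary grows only when the patient visits a region of the state space not yet covered. A lower threshold would admit more samples and give a denser, more expensive Gaussian process.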
