A Reinforcement Learning Approach to Weaning of Mechanical Ventilation in Intensive Care Units

The management of invasive mechanical ventilation, and the regulation of sedation and analgesia during ventilation, constitute a major part of the care of patients admitted to intensive care units. Both prolonged dependence on mechanical ventilation and premature extubation are associated with increased risk of complications and higher hospital costs, but clinical opinion on the best protocol for weaning patients off a ventilator varies. This work aims to develop a decision support tool that uses available patient information to predict time-to-extubation readiness and to recommend a personalized regimen of sedation dosage and ventilator support. To this end, we use off-policy reinforcement learning algorithms to determine the best action at a given patient state from sub-optimal historical ICU data. We compare treatment policies from fitted Q-iteration with extremely randomized trees and with feedforward neural networks, and demonstrate that the policies learnt show promise in recommending weaning protocols with improved outcomes, in terms of minimizing rates of reintubation and regulating physiological stability.
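As a rough illustration of the batch-mode approach named above, fitted Q-iteration with extremely randomized trees can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in this work: the function names, the one-hot action encoding, the discount factor, and the tree hyperparameters are all hypothetical, and the abstract does not specify the state features, action set, or reward. It relies on scikit-learn's `ExtraTreesRegressor` and expects logged transitions of the form (state, action, reward, next state).

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor


def fitted_q_iteration(states, actions, rewards, next_states, terminal,
                       n_actions, gamma=0.9, n_iterations=10, seed=0):
    """Batch-mode fitted Q-iteration on logged transitions.

    states, next_states: float arrays of shape (n, d)
    actions: int array of shape (n,) with values in [0, n_actions)
    rewards, terminal: float arrays of shape (n,); terminal is 1.0 at episode end
    All hyperparameters here are illustrative placeholders.
    """
    n = len(states)
    one_hot = np.eye(n_actions)
    # The regressor maps [state features, one-hot action] to a Q-value.
    x = np.hstack([states, one_hot[actions]])
    model = None
    for _ in range(n_iterations):
        if model is None:
            # First iteration: the target is the immediate reward.
            targets = rewards
        else:
            # Later iterations: bootstrap from the previous Q estimate,
            # taking the maximum over actions at the next state.
            q_next = np.column_stack([
                model.predict(
                    np.hstack([next_states, np.tile(one_hot[a], (n, 1))]))
                for a in range(n_actions)
            ])
            targets = rewards + gamma * (1.0 - terminal) * q_next.max(axis=1)
        model = ExtraTreesRegressor(n_estimators=50, min_samples_leaf=5,
                                    random_state=seed)
        model.fit(x, targets)
    return model


def greedy_action(model, state, n_actions):
    """Return the action with the highest estimated Q-value at one state."""
    x = np.hstack([np.tile(state, (n_actions, 1)), np.eye(n_actions)])
    return int(np.argmax(model.predict(x)))
```

Because the targets are computed only from logged transitions, the procedure is off-policy: the data can come from historical clinician decisions rather than from the policy being learnt. Swapping the tree ensemble for a feedforward network regressor would give the neural variant of the same loop.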
