The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care

Sepsis is the third leading cause of death worldwide and the main cause of mortality in hospitals (refs. 1–3), but the best treatment strategy remains uncertain. In particular, evidence suggests that current practices in the administration of intravenous fluids and vasopressors are suboptimal and likely induce harm in a proportion of patients (refs. 1, 4–6). To tackle this sequential decision-making problem, we developed a reinforcement learning agent, the Artificial Intelligence (AI) Clinician, which extracted implicit knowledge from an amount of patient data that exceeds many-fold the lifetime experience of human clinicians and learned optimal treatment by analyzing a myriad of (mostly suboptimal) treatment decisions. We demonstrate that the value of the AI Clinician's selected treatment is on average reliably higher than that of human clinicians. In a large validation cohort independent of the training data, mortality was lowest in patients for whom clinicians' actual doses matched the AI decisions. Our model provides individualized and clinically interpretable treatment decisions for sepsis that could improve patient outcomes.

A reinforcement learning agent, the AI Clinician, can assist physicians by providing individualized and clinically interpretable treatment decisions to improve patient outcomes.
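To make the idea of learning a treatment policy from logged decisions concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small discrete Markov decision process whose states, actions, rewards (+1 for recovery, -1 for death) and discount factor are all hypothetical choices made for illustration. The function names `estimate_mdp` and `value_iteration` are likewise invented here. The sketch estimates transition probabilities and mean rewards from observed (state, action, reward, next state) tuples, then runs value iteration restricted to actions that were actually observed.

```python
import numpy as np


def estimate_mdp(transitions, n_states, n_actions):
    """Estimate P(s' | s, a) and mean reward R(s, a) from logged tuples (s, a, r, s')."""
    counts = np.zeros((n_states, n_actions, n_states))
    reward_sum = np.zeros((n_states, n_actions))
    for s, a, r, s_next in transitions:
        counts[s, a, s_next] += 1
        reward_sum[s, a] += r
    visits = counts.sum(axis=2)          # how often each (s, a) pair was observed
    seen = visits > 0                    # mask of (s, a) pairs present in the data
    P = np.zeros_like(counts)
    P[seen] = counts[seen] / visits[seen][:, None]
    R = np.zeros_like(reward_sum)
    R[seen] = reward_sum[seen] / visits[seen]
    return P, R, seen


def value_iteration(P, R, seen, gamma=0.99, tol=1e-8, max_iter=10_000):
    """Return state values and a greedy policy, using only observed actions."""
    n_states, _ = R.shape
    V = np.zeros(n_states)
    has_action = seen.any(axis=1)        # states with no observed action are terminal
    for _ in range(max_iter):
        Q = R + gamma * (P @ V)          # shape (n_states, n_actions)
        Q[~seen] = -np.inf               # never recommend an action absent from the data
        V_new = np.where(has_action, Q.max(axis=1), 0.0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    policy = Q.argmax(axis=1)            # entry is meaningless for terminal states
    return V, policy


# Hypothetical toy data: state 0 = unstable, 1 = recovered, 2 = died.
# Action 0 and action 1 stand in for two dosing choices.
toy_transitions = [
    (0, 0, -1.0, 2), (0, 0, -1.0, 2), (0, 0, 1.0, 1),
    (0, 1, 1.0, 1), (0, 1, 1.0, 1), (0, 1, -1.0, 2),
]
P, R, seen = estimate_mdp(toy_transitions, n_states=3, n_actions=2)
V, policy = value_iteration(P, R, seen)
print("value of state 0:", V[0], "recommended action:", policy[0])
```

In this toy data, action 1 is followed by recovery more often than action 0, so the greedy policy selects action 1 in state 0. Masking unobserved state-action pairs is one simple way to keep recommendations within the range of decisions clinicians actually made.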

[1]  Martin L. Puterman,et al.  Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[2]  C. Steiner,et al.  Comorbidity measures for use with administrative data. , 1998, Medical care.

[3]  Doina Precup,et al.  Eligibility Traces for Off-Policy Policy Evaluation , 2000, ICML.

[4]  Andrew J. Schaefer,et al.  Modeling Medical Treatment Using Markov Decision Processes , 2005 .

[5]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[6]  G.G. Yin,et al.  Discrete-Time Markov Chains , 2006, IEEE Transactions on Automatic Control.

[7]  Sergei Vassilvitskii,et al.  k-means++: the advantages of careful seeding , 2007, SODA '07.

[8]  Caleb W. Hug,et al.  Detecting hazardous intensive care patient episodes using real-time mortality models , 2009 .

[9]  Richard H. Jones,et al.  Bayesian information criterion for longitudinal and clustered data , 2011, Statistics in medicine.

[10]  Andrew Rhodes,et al.  Drotrecogin alfa (activated) in adults with septic shock. , 2012, The New England journal of medicine.

[11]  C. Torio,et al.  National Inpatient Hospital Costs: The Most Expensive Conditions by Payer, 2011 , 2013 .

[12]  Samuel M Brown,et al.  Survival after shock requiring high-dose vasopressor therapy. , 2013, Chest.

[13]  A. Aldo Faisal,et al.  The use of reinforcement learning algorithms to meet the challenges of an artificial pancreas , 2013, Expert review of medical devices.

[14]  Kris K. Hauser,et al.  Artificial intelligence framework for simulating clinical decision-making: A Markov decision process approach , 2013, Artif. Intell. Medicine.

[15]  A. Aldo Faisal,et al.  Towards efficient, personalized anesthesia using continuous reinforcement learning for propofol infusion control , 2013, 2013 6th International IEEE/EMBS Conference on Neural Engineering (NER).

[16]  G. Escobar,et al.  Hospital deaths in patients with sepsis from 2 independent cohorts. , 2014, JAMA.

[17]  Wu Ji,et al.  Early versus delayed administration of norepinephrine in patients with septic shock , 2014, Critical Care.

[18]  Stephen E. Lapinsky,et al.  Interaction Between Fluids and Vasoactive Agents on Mortality in Septic Shock: A Multicenter, Observational Study* , 2014, Critical care medicine.

[19]  Gerhard Tutz,et al.  Improved methods for the imputation of missing data by nearest neighbor methods , 2015, Comput. Stat. Data Anal.

[20]  P. Marik,et al.  The demise of early goal‐directed therapy for severe sepsis and septic shock , 2015, Acta anaesthesiologica Scandinavica.

[21]  Philip S. Thomas,et al.  High Confidence Policy Improvement , 2015, ICML.

[22]  J. Vincent,et al.  A positive fluid balance is an independent prognostic factor in patients with sepsis , 2015, Critical Care.

[23]  Philip S. Thomas,et al.  High-Confidence Off-Policy Evaluation , 2015, AAAI.

[24]  Gavin D Perkins,et al.  Levosimendan for the Prevention of Acute Organ Dysfunction in Sepsis. , 2016, The New England journal of medicine.

[25]  Shamim Nemati,et al.  Machine Learning and Decision Support in Critical Care , 2016, Proceedings of the IEEE.

[26]  R. Bellomo,et al.  The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). , 2016, JAMA.

[27]  Nan Jiang,et al.  Doubly Robust Off-policy Value Evaluation for Reinforcement Learning , 2015, ICML.

[28]  P. Marik,et al.  A rational approach to fluid therapy in sepsis. , 2016, British journal of anaesthesia.

[29]  P. Marik,et al.  Fluid administration in severe sepsis and septic shock, patterns and outcomes: an analysis of a large national database , 2017, Intensive Care Medicine.

[30]  M. Matthay,et al.  Sepsis: pathophysiology and clinical management , 2016, British Medical Journal.

[31]  Subhashini Venugopalan,et al.  Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. , 2016, JAMA.

[32]  Peter Szolovits,et al.  MIMIC-III, a freely accessible critical care database , 2016, Scientific Data.

[33]  T. Rea,et al.  Assessment of Clinical Criteria for Sepsis: For the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). , 2016, JAMA.

[34]  Marc G. Bellemare,et al.  Safe and Efficient Off-Policy Reinforcement Learning , 2016, NIPS.

[35]  J. Vincent The Future of Critical Care Medicine: Integration and Personalization , 2016, Critical care medicine.

[36]  Philip S. Thomas,et al.  Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning , 2016, ICML.

[37]  R. Bellomo,et al.  Prognostic Accuracy of the SOFA Score, SIRS Criteria, and qSOFA Score for In-Hospital Mortality Among Adults With Suspected Infection Admitted to the Intensive Care Unit , 2017, JAMA.

[38]  Barbara E. Engelhardt,et al.  A Reinforcement Learning Approach to Weaning of Mechanical Ventilation in Intensive Care Units , 2017, UAI.

[39]  Jonathan H. Chen,et al.  Machine Learning and Prediction in Medicine - Beyond the Peak of Inflated Expectations. , 2017, The New England journal of medicine.

[40]  F. V. van Haren,et al.  Fluid resuscitation in human sepsis: Time to rewrite history? , 2017, Annals of Intensive Care.

[41]  Peter Stone,et al.  Bootstrapping with Models: Confidence Intervals for Off-Policy Evaluation , 2016, AAAI.