Extracting Clinician's Goals by What-if Interpretable Modeling

Although reinforcement learning (RL) has had tremendous success in many fields, applying RL to real-world settings such as healthcare is challenging when the reward is hard to specify and no exploration is allowed. In this work, we focus on recovering clinicians’ rewards in treating patients. We incorporate what-if reasoning to explain clinicians’ actions based on future outcomes. We use generalized additive models (GAMs), a class of accurate, interpretable models, to recover the reward. In both simulation and a real-world hospital dataset, we show that our model outperforms baselines. Finally, our model’s explanations match several clinical guidelines for treating patients, whereas we find that the previously used linear model often contradicts them.
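To make the additive structure of a GAM reward concrete, the following is a minimal sketch, not the authors' implementation. It assumes the reward is evaluated on a vector of predicted what-if outcomes and takes the standard GAM form r(x) = b + f_1(x_1) + ... + f_d(x_d). The class `ShapeFunction`, the function `gam_reward`, the two unnamed features, and every bin edge and bin value are hypothetical, chosen only for illustration; nothing here is learned from data.

```python
import numpy as np


class ShapeFunction:
    """Piecewise-constant f_j: maps one feature value to a reward contribution.

    Hypothetical helper for illustration; a learned GAM would fit these values.
    """

    def __init__(self, bin_edges, bin_values):
        self.bin_edges = np.asarray(bin_edges, dtype=float)
        self.bin_values = np.asarray(bin_values, dtype=float)
        # k sorted edges split the real line into k + 1 bins.
        assert len(self.bin_values) == len(self.bin_edges) + 1

    def __call__(self, x):
        # Look up the bin each value falls into and return that bin's value.
        idx = np.searchsorted(self.bin_edges, x, side="right")
        return self.bin_values[idx]


def gam_reward(future_outcomes, shape_functions, intercept=0.0):
    """Additive reward: intercept plus the sum of per-feature contributions.

    Returns the rewards and the per-feature contributions, so each feature's
    effect can be inspected on its own (the interpretability property).
    """
    outcomes = np.atleast_2d(np.asarray(future_outcomes, dtype=float))
    contributions = np.column_stack(
        [f(outcomes[:, j]) for j, f in enumerate(shape_functions)]
    )
    return intercept + contributions.sum(axis=1), contributions


# Made-up shape functions for two hypothetical outcome features.
shapes = [
    ShapeFunction(bin_edges=[60.0, 70.0], bin_values=[-1.0, 0.0, 0.5]),
    ShapeFunction(bin_edges=[2.0], bin_values=[0.25, -0.75]),
]

# Two hypothetical what-if outcome vectors, e.g. under two candidate actions.
what_if_outcomes = [[55.0, 1.0], [75.0, 3.0]]
rewards, contributions = gam_reward(what_if_outcomes, shapes)
print(rewards)        # total reward per candidate
print(contributions)  # per-feature breakdown of each reward
```

Because the reward is a sum of one-dimensional functions, each f_j can be plotted against its feature and compared directly with a clinical guideline, which is the kind of inspection a GAM permits and a black-box reward does not.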
