Reinforcement Learning for Optimization of COVID-19 Mitigation Policies

The year 2020 has seen the COVID-19 virus lead to one of the worst global pandemics in history. As a result, governments around the world face the challenge of protecting public health while keeping the economy running to the greatest extent possible. Epidemiological models provide insight into the spread of these types of diseases and predict the effects of possible intervention policies. However, to date, even the most data-driven intervention policies rely on heuristics. In this paper, we study how reinforcement learning (RL) can be used to optimize mitigation policies that minimize the economic impact without overwhelming hospital capacity. Our main contributions are (1) a novel agent-based pandemic simulator which, unlike traditional models, is able to model fine-grained interactions among people at specific locations in a community; and (2) an RL-based methodology for optimizing fine-grained mitigation policies within this simulator. Our results validate both the overall simulator behavior and the learned policies under realistic conditions.
