A Deep Reinforcement Learning-Enabled Dynamic Redeployment System for Mobile Ambulances

Protecting citizens' lives from sudden accidents (e.g., traffic accidents) and acute illnesses (e.g., heart attacks) is of vital importance in urban computing. Every day, many people suffer such emergencies and need ambulances to transport them to hospitals, and for patients in danger, every second counts. In this paper, we propose a dynamic ambulance redeployment system that reduces the time ambulances need to pick up patients and increases the probability that patients are saved in time. Specifically, whenever an ambulance becomes available (e.g., after it finishes transporting a patient to a hospital), the system redeploys it to a suitable ambulance station so that it is better positioned to pick up future patients. Dynamic ambulance redeployment is challenging, however, because redeploying an available ambulance requires simultaneously considering multiple dynamic factors of each station. Trading off these factors with handcrafted rules is almost impossible. To address this issue, we propose a deep neural network, called the deep score network, that balances each station's dynamic factors into a single score, leveraging the strong representation ability of deep neural networks. We then propose a deep reinforcement learning framework to learn the deep score network. Finally, based on the learned deep score network, we provide an effective dynamic ambulance redeployment algorithm. Experimental results on real-world data show clear advantages of our method over the baselines: our method reduces the average patient pickup time by ~100 seconds (~20%) and raises the ratio of patients picked up within 10 minutes from 0.786 to 0.838. With our method, people in danger have a better chance of being saved in time.
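
The redeployment step described in the abstract (score every station from its dynamic factors, then send the newly available ambulance to one of them) can be illustrated with a minimal sketch. This is not the authors' implementation: the three factors per station, the one-hidden-layer architecture, the random untrained weights, and the rule of picking the highest-scoring station are all assumptions made for illustration. In the paper, the network's parameters are learned with deep reinforcement learning rather than drawn at random.

```python
import math
import random


def make_score_network(num_factors, hidden_size=8, seed=0):
    """Build a tiny one-hidden-layer network with random (untrained) weights.

    Hypothetical stand-in for a learned deep score network.
    """
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(num_factors)]
          for _ in range(hidden_size)]
    b1 = [0.0] * hidden_size
    w2 = [rng.uniform(-1, 1) for _ in range(hidden_size)]
    b2 = 0.0
    return w1, b1, w2, b2


def score_station(network, factors):
    """Map one station's dynamic factors to a single scalar score."""
    w1, b1, w2, b2 = network
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, factors)) + b)
        for row, b in zip(w1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def choose_station(network, station_factors):
    """Score every station and return the highest-scoring one (assumed rule)."""
    scores = {
        name: score_station(network, factors)
        for name, factors in station_factors.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical factors per station, chosen only for illustration:
# (idle ambulances, expected nearby demand, travel time from the freed
# ambulance to the station in minutes).
stations = {
    "station_a": [2.0, 0.5, 6.0],
    "station_b": [0.0, 1.2, 9.0],
    "station_c": [1.0, 0.8, 4.0],
}

network = make_score_network(num_factors=3)
best_station, station_scores = choose_station(network, stations)
print(best_station, station_scores)
```

Collapsing each station's factors into one score keeps the decision itself simple: once the scores are available, choosing a destination is a single comparison across stations, and all of the trade-off logic lives in the network's parameters.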
