Improved Multi-Agent Reinforcement Learning for Path Planning-Based Crowd Simulation

Combining multi-agent technology with reinforcement learning has proven effective for path planning-based crowd simulation. However, existing solutions remain unsatisfactory because of the mutual influence among agents. Therefore, an improved multi-agent reinforcement learning method (the IMARL algorithm) is introduced. In this method, the intersections of pedestrian trajectories extracted from real video are first used as the state space for reinforcement learning. The crowd is divided into groups and a leader is selected for each group. A bulletin board is added to the multi-agent reinforcement learning algorithm to store the empirical knowledge acquired during learning, and a navigation agent passes information between the leaders and the bulletin board. The original social force model is improved by adding a cohesive force based on visual factors to the force formula. The IMARL algorithm is then combined with the improved social force model for crowd evacuation simulation. Using a two-layer control mechanism, leaders in the upper layer select paths through a decision process based on the IMARL algorithm, while individuals in the lower-layer groups evacuate according to the improved social force model. The proposed method not only alleviates the curse of dimensionality in reinforcement learning but also improves convergence speed, and crowd evacuation simulation experiments show that evacuation efficiency is effectively improved. In addition, the method can provide concrete guidance for improving crowd evacuation and decision support for preventing and managing large-scale crowd trampling incidents.
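As a minimal sketch of the two-layer mechanism described above, the following Python fragment models the bulletin board as a Q-table shared by all leaders (upper layer) and adds a simple leader-attraction term to a social-force update (lower layer). The class names, parameters (alpha, gamma, k_cohesion), and the specific form of the cohesive force are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# --- Upper layer: leader path planning via Q-learning over trajectory intersections ---
# States are the intersection nodes extracted from real pedestrian video; the "bulletin
# board" is modeled here as a Q-table shared by all leaders (an assumption, since the
# abstract does not specify the exact knowledge-sharing scheme).

class BulletinBoard:
    def __init__(self, n_states, n_actions):
        self.q = np.zeros((n_states, n_actions))   # shared empirical knowledge

    def read(self, state):
        return self.q[state]

    def write(self, state, action, value):
        self.q[state, action] = value


class Leader:
    def __init__(self, board, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.board, self.alpha, self.gamma, self.epsilon = board, alpha, gamma, epsilon

    def choose_action(self, state, n_actions, rng):
        # epsilon-greedy selection over the shared Q-values
        if rng.random() < self.epsilon:
            return int(rng.integers(n_actions))
        return int(np.argmax(self.board.read(state)))

    def update(self, state, action, reward, next_state):
        # standard Q-learning backup written back to the bulletin board
        target = reward + self.gamma * np.max(self.board.read(next_state))
        new_q = (1 - self.alpha) * self.board.read(state)[action] + self.alpha * target
        self.board.write(state, action, new_q)


# --- Lower layer: social force model with an added cohesive force toward the leader ---
# Individuals are driven toward the waypoint chosen by their leader; the cohesion term
# pulling members toward the leader is an illustrative form, not the paper's formula.

def social_force(pos, vel, goal, leader_pos, desired_speed=1.34, tau=0.5, k_cohesion=0.3):
    e_goal = (goal - pos) / (np.linalg.norm(goal - pos) + 1e-9)
    f_drive = (desired_speed * e_goal - vel) / tau       # Helbing-style driving force
    f_cohesion = k_cohesion * (leader_pos - pos)         # pull toward the group leader
    return f_drive + f_cohesion                          # repulsion terms omitted for brevity
```

In this sketch, each leader runs an independent Q-learning update but reads and writes the same shared table, which is one plausible reading of how a bulletin board accelerates convergence across agents.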
