Multi-agent path planning in unknown environment with reinforcement learning and neural network

Path planning for multiple agents is much harder than for a single agent. Reinforcement learning (RL) is a popular method for this problem. However, RL cannot solve the path planning problem directly in an unknown environment. In this paper, a neural network (NN) is applied to estimate the unvisited space, and traditional multi-agent reinforcement learning is modified by this neural approximation. The proposed path planning has two stages: first, RL is used to generate training samples for the NN; then the trained NN supplies approximate actions to the agents. The advantage of this method is that RL does not need to be repeated for unvisited states. Experimental results show that the proposed algorithm can generate suboptimal paths for multiple agents in an unknown environment.
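The two-stage idea can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a single agent on a hypothetical obstacle-free 6x6 grid, tabular Q-learning for stage one, and a small Gaussian radial-basis-function network fitted by ridge regression for stage two (the abstract does not specify the network architecture). To stand in for unvisited states, a few states are withheld from the network's training set and then queried.

```python
import numpy as np

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
SIZE, GOAL = 6, (5, 5)                         # hypothetical 6x6 grid, goal in a corner

def step(state, a):
    """Apply action a; moving into a wall leaves the agent in place."""
    r = min(max(state[0] + ACTIONS[a][0], 0), SIZE - 1)
    c = min(max(state[1] + ACTIONS[a][1], 0), SIZE - 1)
    nxt = (r, c)
    done = nxt == GOAL
    return nxt, (10.0 if done else -1.0), done

def q_learning(starts, episodes=3000, alpha=0.5, gamma=0.9, eps=0.3, seed=0):
    """Stage 1: epsilon-greedy tabular Q-learning; returns {state: Q-values}."""
    rng = np.random.default_rng(seed)
    q = {}
    for _ in range(episodes):
        s = starts[int(rng.integers(len(starts)))]
        for _ in range(60):  # cap on episode length
            qs = q.setdefault(s, np.zeros(4))
            a = int(rng.integers(4)) if rng.random() < eps else int(np.argmax(qs))
            nxt, reward, done = step(s, a)
            future = 0.0 if done else gamma * np.max(q.setdefault(nxt, np.zeros(4)))
            qs[a] += alpha * (reward + future - qs[a])
            s = nxt
            if done:
                break
    return q

def rbf_features(x, centers, sigma=1.0):
    """Gaussian RBF activations of each row of x with respect to each center."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

def fit_rbf(states, targets, ridge=1e-2):
    """Stage 2: one RBF unit per training state, output weights by ridge regression."""
    phi = rbf_features(states, states)
    return np.linalg.solve(phi + ridge * np.eye(len(states)), targets)

def predict_action(state, centers, weights):
    """Greedy action from the network's approximate Q-values."""
    phi = rbf_features(np.array([state], dtype=float), centers)
    return int(np.argmax(phi @ weights))

all_states = [(r, c) for r in range(SIZE) for c in range(SIZE) if (r, c) != GOAL]
held_out = [(2, 2), (1, 4), (3, 1)]          # stand-ins for unvisited states
q_table = q_learning(all_states)

train = [s for s in all_states if s not in held_out]
X = np.array(train, dtype=float)             # network inputs: grid coordinates
Y = np.array([q_table[s] for s in train])    # network targets: learned Q-values
W = fit_rbf(X, Y)

# Query the network for states it was never trained on.
predicted = {s: predict_action(s, X, W) for s in held_out}
print(predicted)
```

In this sketch the network answers for the withheld states without any further Q-learning updates, which is the benefit the abstract describes. A multi-agent version would need a joint or per-agent state encoding and some handling of collisions between agents, which this sketch leaves out.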
