Path planning in unknown environment with kernel smoothing and reinforcement learning for multi-agent systems

Path planning for multi-agent systems is difficult when the environment is unknown. Popular path planning methods, such as reinforcement learning (RL), do not work in two cases: an unknown environment and multiple agents. In this paper, we use kernel smoothing, an intelligent method, to estimate the unknown environment and combine it with RL. The advantage of this combination is that RL does not have to be repeated for unvisited states. The path planning process has three stages: 1) RL is applied to generate training samples; 2) a model is trained on these samples by kernel smoothing; 3) the trained model gives the agents an approximate action. Experimental results show that the proposed algorithm can generate the desired paths for multiple agents in an unknown environment.
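
The three stages can be illustrated with a minimal single-agent sketch. It is not the paper's implementation: the toy grid world, the reward values, the Gaussian kernel, the Nadaraya-Watson form of the smoother, and all function names (q_learning_samples, nadaraya_watson, approximate_action) are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical 4-connected action set: right, left, up, down.
ACTIONS = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])


def q_learning_samples(size=5, goal=(4, 4), episodes=300, alpha=0.5,
                       gamma=0.9, epsilon=0.2, seed=0):
    """Stage 1: tabular Q-learning on a toy grid; returns (states, actions)."""
    rng = np.random.default_rng(seed)
    q = np.zeros((size, size, len(ACTIONS)))
    visited = set()
    for _ in range(episodes):
        x, y = (int(v) for v in rng.integers(0, size, size=2))
        for _ in range(4 * size):
            if (x, y) == goal:
                break
            visited.add((x, y))
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = int(rng.integers(len(ACTIONS)))
            else:
                a = int(np.argmax(q[x, y]))
            # Moves that would leave the grid are clipped to the border.
            nx = int(np.clip(x + ACTIONS[a, 0], 0, size - 1))
            ny = int(np.clip(y + ACTIONS[a, 1], 0, size - 1))
            reward = 1.0 if (nx, ny) == goal else -0.01  # assumed rewards
            bootstrap = 0.0 if (nx, ny) == goal else gamma * np.max(q[nx, ny])
            q[x, y, a] += alpha * (reward + bootstrap - q[x, y, a])
            x, y = nx, ny
    # Training samples: each visited state paired with its greedy action.
    states = np.array(sorted(visited), dtype=float)
    actions = np.array([int(np.argmax(q[int(sx), int(sy)]))
                        for sx, sy in states])
    return states, actions


def nadaraya_watson(query, states, targets, bandwidth=1.0):
    """Stage 2: Gaussian-kernel weighted average of the stored targets."""
    d2 = np.sum((states - query) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return w @ targets / np.sum(w)


def approximate_action(query, states, actions, bandwidth=1.0):
    """Stage 3: smooth the sampled move directions, then snap to an action."""
    direction = nadaraya_watson(np.asarray(query, dtype=float), states,
                                ACTIONS[actions].astype(float), bandwidth)
    return int(np.argmax(ACTIONS @ direction))


if __name__ == "__main__":
    states, actions = q_learning_samples()
    # Query a position that is not a grid cell, so it was never visited.
    print(approximate_action((2.5, 2.5), states, actions))
```

In this sketch, "training" the kernel smoother amounts to storing the RL samples; the query at (2.5, 2.5) is answered from nearby samples without running RL again, which is the benefit the abstract describes. The bandwidth controls how far that influence reaches and would need tuning in practice.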
