Multi-robot Cooperation Based on Continuous Reinforcement Learning with Two State Space Representations

The field of multi-robot systems (MRS) deals with groups of autonomous robots and is attracting considerable research interest. An MRS is expected to achieve tasks that are difficult for an individual robot to perform. In an MRS, reinforcement learning (RL) is a promising approach to the distributed control of each robot: RL allows a robot to learn a mapping from its states to its actions through the rewards or payoffs obtained by interacting with its environment. In theory, however, the environment of an MRS is nonstationary, because the rewards or payoffs received by a learning robot depend not only on its own actions but also on the actions of the other robots. From this perspective, our research group has been developing a technique that segments the state and action spaces simultaneously and autonomously in order to extend the adaptability of an MRS to dynamic environments. To improve learning performance, we introduce a mechanism that selects between two state-space representations: one represented by a parametric model for exploration, and the other by a nonparametric model for exploitation. The proposed technique is expected to show better learning performance and greater robustness against unpredicted environmental changes. We investigate the proposed technique through computer simulations of a cooperative box-pushing task performed by six autonomous mobile robots.
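
The abstract gives no implementation details, so the following is a minimal Python sketch of one way such a selection mechanism could work, not the authors' actual method. Everything here is an assumption: the class names (ParametricModel, InstanceBasedModel, DualRepresentationAgent), the fixed-resolution grid standing in for the parametric model, the k-nearest-neighbour average standing in for the nonparametric instance-based model, and the epsilon-greedy rule standing in for the paper's selection mechanism.

```python
import math
import random

class ParametricModel:
    """Parametric state-space model: a fixed-resolution grid over the
    continuous state (a hypothetical stand-in for the paper's model)."""
    def __init__(self, resolution=0.5):
        self.resolution = resolution
        self.q = {}  # (discretized state, action) -> estimated value

    def _key(self, state):
        return tuple(round(s / self.resolution) for s in state)

    def value(self, state, action):
        return self.q.get((self._key(state), action), 0.0)

    def update(self, state, action, target, alpha=0.1):
        key = (self._key(state), action)
        self.q[key] = self.q.get(key, 0.0) + alpha * (target - self.q.get(key, 0.0))

class InstanceBasedModel:
    """Nonparametric, instance-based model: stores visited
    (state, action, value) instances and answers queries with a
    distance-weighted average over the k nearest stored instances."""
    def __init__(self, k=3):
        self.k = k
        self.instances = []  # list of (state, action, value) tuples

    def value(self, state, action):
        nearest = sorted(
            (inst for inst in self.instances if inst[1] == action),
            key=lambda inst: math.dist(state, inst[0]),
        )[: self.k]
        if not nearest:
            return 0.0
        weights = [1.0 / (1e-6 + math.dist(state, s)) for s, _, _ in nearest]
        values = [v for _, _, v in nearest]
        return sum(w * v for w, v in zip(weights, values)) / sum(weights)

    def update(self, state, action, target):
        self.instances.append((tuple(state), action, target))

class DualRepresentationAgent:
    """Selects between the two representations: the parametric model
    drives exploration, the instance-based model drives exploitation."""
    def __init__(self, actions, epsilon=0.2, gamma=0.9):
        self.actions = actions
        self.epsilon = epsilon
        self.gamma = gamma
        self.explorer = ParametricModel()
        self.exploiter = InstanceBasedModel()

    def select_action(self, state):
        # With probability epsilon, act on the coarse parametric model
        # (exploration); otherwise act on the instance-based model
        # (exploitation).
        model = self.explorer if random.random() < self.epsilon else self.exploiter
        return max(self.actions, key=lambda a: model.value(state, a))

    def learn(self, state, action, reward, next_state):
        # One-step bootstrapped target; both representations are updated
        # from the same experience.
        target = reward + self.gamma * max(
            self.exploiter.value(next_state, a) for a in self.actions
        )
        self.explorer.update(state, action, target)
        self.exploiter.update(state, action, target)
```

In the box-pushing setting, `state` would be a robot's continuous sensor vector and `actions` a discrete set of motor commands; the epsilon-greedy switch is only one plausible selection rule, used here for brevity.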
