Reinforcement Learning Based on Multi-Agent Systems in RoboCup

Multi-agent systems are a particular type of distributed artificial intelligence system. Because the players in a game act as autonomous agents, agent learning has become a major direction of research. In this paper, a multi-agent reinforcement learning method with a specific context, built on basic reinforcement learning, is proposed. The method is applied to RoboCup to learn coordination among agents. During learning, the game field is divided into different areas, and the action choice depends on the area in which the ball is currently located; this reduces both the state space and the action space. After learning, the optimal joint policy is determined. A comparison experiment between a stochastic policy and this learned optimal policy demonstrates the effectiveness of the approach.
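
The abstract does not give implementation details, so the following is a minimal Python sketch of the area-based state reduction it describes, paired with one-step tabular Q-learning. The field partition (a 3x2 grid), the per-area action sets, the reward scheme, and all names (ball_area, ACTIONS_BY_AREA, and so on) are illustrative assumptions, not the paper's actual design.

```python
import random
from collections import defaultdict

# Assumed field dimensions (standard soccer pitch) and an assumed 3x2 grid;
# the paper divides the field into areas but does not specify the partition.
FIELD_LENGTH, FIELD_WIDTH = 105.0, 68.0
GRID_COLS, GRID_ROWS = 3, 2

# Illustrative per-area action sets: only areas in the rightmost (attacking)
# column allow a shot. The paper's actual area-to-action mapping is not given.
ACTIONS_BY_AREA = {
    area: ["pass", "dribble", "shoot"]
    if area % GRID_COLS == GRID_COLS - 1
    else ["pass", "dribble"]
    for area in range(GRID_COLS * GRID_ROWS)
}

def ball_area(x, y):
    """Map continuous ball coordinates to a discrete area index (the state)."""
    col = min(int(x / (FIELD_LENGTH / GRID_COLS)), GRID_COLS - 1)
    row = min(int(y / (FIELD_WIDTH / GRID_ROWS)), GRID_ROWS - 1)
    return row * GRID_COLS + col

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # assumed learning hyperparameters
Q = defaultdict(float)                   # Q[(area, action)] -> value

def choose_action(area):
    """Epsilon-greedy choice restricted to the actions allowed in this area."""
    actions = ACTIONS_BY_AREA[area]
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(area, a)])

def update(area, action, reward, next_area):
    """Standard one-step Q-learning backup over the reduced state space."""
    best_next = max(Q[(next_area, a)] for a in ACTIONS_BY_AREA[next_area])
    Q[(area, action)] += ALPHA * (reward + GAMMA * best_next - Q[(area, action)])
```

The point of the sketch is the state reduction: mapping the ball's continuous position to a coarse area index collapses the state space to a handful of discrete states, and restricting each area to a small action set shrinks the action space in the same way, which is what makes the tabular joint-policy learning described in the abstract tractable.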
