Cooperative Multi-agent Learning in a Large Dynamic Environment

In this work, we address the problem of cooperative multi-agent learning for distributed decision making in non-stationary environments. Our principal focus is to improve learning by exchanging information between neighboring agents and to ensure adaptation to a changed environment without discarding knowledge already acquired. First, the distributed dynamic correlation matrix based multi-Q learning method presented in [1] is evaluated. To overcome the shortcomings of this method, a new multi-agent reinforcement learning approach and a new cooperative action selection strategy are developed. Several simulation tests, conducted on a cooperative foraging task with a single moving target, show the efficiency of the proposed methods in large, unknown and temporarily dynamic environments.
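As a rough illustration of the kind of mechanism the abstract refers to, the sketch below pairs the standard one-step tabular Q-learning update with a very simple way of exchanging information between neighboring agents. It is not the method of [1] nor the approach proposed in this work: the max-based merge rule, the parameter values `ALPHA` and `GAMMA`, the grid-cell states and the function names are all assumptions chosen for illustration.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)


def q_update(q, state, action, reward, next_state, actions):
    """Standard one-step Q-learning update on a tabular Q-function."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + GAMMA * best_next - q[(state, action)]
    q[(state, action)] += ALPHA * td_error


def merge_from_neighbors(q, neighbor_tables):
    """Hypothetical sharing rule: keep the larger value per (state, action).

    This is an illustrative assumption, not the exchange scheme of the paper.
    """
    for other in neighbor_tables:
        for key, value in other.items():
            if value > q.get(key, 0.0):
                q[key] = value


# Two agents on a grid; states are (row, col) cells (hypothetical setup).
actions = ["up", "down", "left", "right"]
q_a = defaultdict(float)
q_b = defaultdict(float)

# Agent B experiences a rewarding transition and updates its own table.
q_update(q_b, (0, 0), "right", 1.0, (0, 1), actions)

# Agent A, a local neighbor of B, merges B's table into its own.
merge_from_neighbors(q_a, [q_b])
```

A max-based merge of this kind is optimistic: once a value has been shared it is never lowered by later merges, so it does not by itself handle an environment that changes. A scheme meant for non-stationary settings would need some way to revise or discount shared values.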

[1] Zhongzhi Shi, et al. Adaptive action selection using utility-based reinforcement learning, 2009, 2009 IEEE International Conference on Granular Computing.

[2] Maja J. Mataric, et al. Learning in Multi-Robot Systems, 1995, Adaption and Learning in Multi-Agent Systems.

[3] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[4] Sean Luke, et al. A pheromone-based utility model for collaborative foraging, 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004).

[5] Yan Meng, et al. Dynamic correlation matrix based multi-Q learning for a multi-robot system, 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[6] Peter Stone, et al. Reinforcement Learning for RoboCup Soccer Keepaway, 2005, Adapt. Behav.

[7] Sean Luke, et al. Collaborative foraging using beacons, 2010, AAMAS.

[8] S. G. Ponnambalam, et al. Reinforcement learning: exploration–exploitation dilemma in multi-agent foraging task, 2012.

[9] Yan Meng, et al. Distributed Reinforcement Learning for Coordinate Multi-Robot Foraging, 2010, J. Intell. Robotic Syst.

[10] Yifan Cai, et al. Intelligent Multi-robot Cooperation for Target Searching and Foraging Tasks in Completely Unknown Environments, 2013.

[11] Mohammad A. Jaradat, et al. Reinforcement based mobile robot navigation in dynamic environment, 2011.

[12] Matthew E. Taylor, et al. Towards student/teacher learning in sequential decision tasks, 2012, AAMAS.

[13] Ming Tan, et al. Multi-Agent Reinforcement Learning: Independent versus Cooperative Agents, 1997, ICML.

[14] Moncef Tagina, et al. A Novel Exploration/Exploitation Policy Accelerating Learning in Both Stationary and Non-Stationary Environment Navigation Tasks, 2015.

[15] Yong Cao, et al. Non-reciprocating Sharing Methods in Cooperative Q-Learning Environments, 2012, 2012 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology.