Coordination in multiagent reinforcement learning systems by virtual reinforcement signals

This paper presents a novel method for on-line coordination in multiagent reinforcement learning systems. In this method, each reinforcement-learning agent learns to select its actions by estimating system dynamics in terms of both the natural reward for task achievement and a virtual reward for cooperation. The virtual reward is determined dynamically by a coordinating agent, which estimates it from the change in the degree of cooperation of all agents using a separate reinforcement learning process. This technique provides adaptive coordination, requires less communication, and ensures that agents cooperate. The validity of virtual rewards for convergence in learning is verified, and the proposed method is tested on two different simulated domains to illustrate its significance. The empirical performance of the coordinated system, compared with that of the uncoordinated system, illustrates its advantages for multiagent systems.
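The two learning loops described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' algorithm: it assumes tabular Q-learning for both the task agents and the coordinator, assumes the natural and virtual rewards are simply summed, and assumes the degree of cooperation is available as a single scalar. The names `QLearner`, `agent_step`, and `coordinator_step` are hypothetical.

```python
from collections import defaultdict


class QLearner:
    """Tabular Q-learning over hashable states and a fixed action list."""

    def __init__(self, actions, alpha=0.1, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.q = defaultdict(float)   # (state, action) -> estimated value

    def greedy(self, state):
        # Action with the highest current estimate in this state.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning backup.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def agent_step(agent, state, action, natural_reward, virtual_reward, next_state):
    # Assumption: the task agent learns from the sum of the natural reward
    # and the virtual reward supplied by the coordinator.
    agent.update(state, action, natural_reward + virtual_reward, next_state)


def coordinator_step(coord, state, chosen_virtual, coop_before, coop_after, next_state):
    # Assumption: the coordinator's actions are candidate virtual-reward values,
    # and its own reward is the change in a scalar degree of cooperation.
    coord.update(state, chosen_virtual, coop_after - coop_before, next_state)
```

In this sketch the coordinator only needs one cooperation measurement before and after each step, which is one plausible reading of the reduced communication the abstract claims; how the degree of cooperation is actually measured is not specified here.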
