Multi-agent Learning Dynamics: A Survey

In this paper, we compare state-of-the-art multi-agent reinforcement learning algorithms across a wide variety of games. We consider two types of algorithms: value iteration and policy iteration. We study four characteristics: initial conditions, parameter settings, convergence speed, and local versus global convergence. Global convergence remains difficult to achieve in practice, despite existing theoretical guarantees. Multiple visualizations are included to provide comprehensive insight into the learning dynamics.
