Multi-agent Reinforcement Learning in Stochastic Single and Multi-stage Games

In this paper we report on a solution method for one of the most challenging problems in multi-agent reinforcement learning: coordination. In previous work we introduced a coordinated exploration technique for independent reinforcement learners, called Exploring Selfish Reinforcement Learning (ESRL). With this technique, agents exclude one or more actions from their private action spaces, so as to coordinate their exploration in a shrinking joint action space. Recently we adapted this solution mechanism to tree-structured common interest multi-stage games. This paper rounds up the results for stochastic single-stage and multi-stage common interest games.
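The exclusion idea behind ESRL can be illustrated with a minimal sketch. The code below is an assumption-laden toy, not the paper's implementation: two independent agents, each driven by a simple linear reward-inaction learning automaton, repeatedly play a common interest matrix game (the payoff matrix here is illustrative). After each exploration phase, an agent that has converged removes the converged action from its private action space, so the joint action space shrinks before the next phase, while the best joint action found so far is remembered.

```python
import random

# Illustrative common interest game: both agents receive the same payoff.
PAYOFF = [
    [11, -30, 0],
    [-30, 7, 6],
    [0, 0, 5],
]

class ESRLAgent:
    """Selfish learner that can exclude actions from its private action space."""

    def __init__(self, n_actions, lr=0.05):
        self.active = list(range(n_actions))
        self.lr = lr
        self.reset()

    def reset(self):
        # Uniform action probabilities over the still-active actions.
        self.probs = {a: 1.0 / len(self.active) for a in self.active}

    def choose(self):
        r, acc = random.random(), 0.0
        for a, p in self.probs.items():
            acc += p
            if r <= acc:
                return a
        return self.active[-1]

    def update(self, action, reward01):
        # Linear reward-inaction automaton update; reward scaled into [0, 1].
        for a in self.probs:
            if a == action:
                self.probs[a] += self.lr * reward01 * (1.0 - self.probs[a])
            else:
                self.probs[a] -= self.lr * reward01 * self.probs[a]

    def converged_action(self, threshold=0.95):
        for a, p in self.probs.items():
            if p >= threshold:
                return a
        return None

    def exclude(self, action):
        # Shrink the private action space, then restart exploration.
        if len(self.active) > 1:
            self.active.remove(action)
        self.reset()

def esrl(phases=3, steps=5000):
    agents = [ESRLAgent(3), ESRLAgent(3)]
    best_joint, best_payoff = None, float("-inf")
    for _ in range(phases):
        # Exploration phase: both automata learn in the current joint space.
        for _ in range(steps):
            a0, a1 = agents[0].choose(), agents[1].choose()
            r = PAYOFF[a0][a1]
            agents[0].update(a0, (r + 30) / 41.0)
            agents[1].update(a1, (r + 30) / 41.0)
        joint = (agents[0].converged_action(), agents[1].converged_action())
        if None not in joint:
            r = PAYOFF[joint[0]][joint[1]]
            if r > best_payoff:
                best_joint, best_payoff = joint, r
            # Synchronisation phase: each agent excludes its converged action,
            # so the next phase explores a smaller joint action space.
            for agent, a in zip(agents, joint):
                agent.exclude(a)
        else:
            for agent in agents:
                agent.reset()
    return best_joint, best_payoff
```

Over successive phases the agents visit different joint actions rather than repeatedly settling on the same (possibly suboptimal) equilibrium, which is the coordination effect the exclusion mechanism is designed to achieve.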
