Distributed Planning in Stochastic Games with Communication

This paper addresses distributed planning in general-sum stochastic games with communication when the model is known. Our main contribution is a novel game-theoretic approach to distributed equilibrium computation and selection. We show, both theoretically and experimentally, that when all agents adopt our approach, it enables efficient distributed equilibrium computation and leads to the selection of a unique equilibrium in general-sum stochastic games with communication.
