Paper Information - Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm

In this paper, we adopt general-sum stochastic games as a framework for multiagent reinforcement learning. Our work extends previous work by Littman on zero-sum stochastic games to a broader framework. We design a multiagent Q-learning method under this framework, and prove that it converges to a Nash equilibrium under specified conditions. This algorithm is useful for finding the optimal strategy when there exists a unique Nash equilibrium in the game. When there exist multiple Nash equilibria in the game, this algorithm should be combined with other learning techniques to find optimal strategies.
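
For orientation, the multiagent Q-learning update described in the abstract can be sketched roughly as follows for a two-player game. The notation is an assumption introduced here, since the abstract gives none: \alpha_t is a learning rate, \beta a discount factor, r^i_t the reward to agent i, s' the next state, and (\pi^1(s'), \pi^2(s')) a Nash equilibrium of the stage game whose payoff matrices are the current Q-values (Q^1_t(s'), Q^2_t(s')).

```latex
\[
Q^i_{t+1}(s, a^1, a^2)
  = (1 - \alpha_t)\, Q^i_t(s, a^1, a^2)
  + \alpha_t \left[ r^i_t + \beta\, \pi^1(s')^{\top}\, Q^i_t(s')\, \pi^2(s') \right],
  \qquad i \in \{1, 2\}
\]
```

In this sketch each agent keeps a Q-table indexed by the joint action, and the single-agent max over next-state actions is replaced by the agent's expected payoff under a stage-game equilibrium. When the stage game has several equilibria, which one is selected matters, which is consistent with the abstract's caveat about multiple Nash equilibria.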

Michael P. Wellman | Junling Hu

[1] J. Nash. Non-Cooperative Games, 1951, Classics in Game Theory.

[2] O. Mangasarian, et al. Two-person nonzero-sum games and quadratic programming, 1964.

[3] Frank Thuijsman, et al. Optimality and equilibria in stochastic games, 1992.

[4] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.

[5] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.

[6] Ariel Rubinstein, et al. A Course in Game Theory, 1995.

[7] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.

[8] Tucker Balch, et al. Learning Roles: Behavioral Diversity in Robot Teams, 1997.

[9] Craig Boutilier, et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, 1998, AAAI/IAAI.

[10] Csaba Szepesvári, et al. A Unified Analysis of Value-Function-Based Reinforcement-Learning Algorithms, 1999, Neural Computation.