Paper Information - Friend-or-Foe Q-learning in General-Sum Games

Friend-or-Foe Q-learning in General-Sum Games

This paper describes an approach to reinforcement learning in multiagent general-sum games in which a learner is told to treat each other agent as either a "friend" or a "foe". This Q-learning-style algorithm provides stronger convergence guarantees than an existing Nash-equilibrium-based learning rule.
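The friend/foe distinction can be illustrated with a small sketch. The abstract does not give the update rule, so everything here is an assumption made for illustration: the function names are hypothetical, the game has two agents, a friend is modeled by maximizing over joint actions, and a foe is modeled by the mixed-strategy maximin value. To avoid a linear-programming dependency, the foe case is restricted to a learner with exactly two actions, where the maximin optimum can be found by checking the endpoints and line crossings.

```python
from itertools import combinations


def friend_value(q):
    """Friend case: assume the other agent cooperates, so take the best joint action.

    q[a][o] is the learner's Q-value for own action a and other-agent action o.
    """
    return max(max(row) for row in q)


def foe_value(q):
    """Foe case: maximin value over the learner's mixed strategies.

    Illustrative restriction: the learner has exactly two actions.
    """
    if len(q) != 2:
        raise ValueError("this sketch handles only two own actions")
    top, bottom = q

    def worst_case(p):
        # Expected value when action 0 is played with probability p and the
        # other agent picks the column that hurts the learner most.
        return min(p * t + (1 - p) * b for t, b in zip(top, bottom))

    # Each column gives a line in p; the optimum lies at p=0, p=1, or a crossing.
    candidates = [0.0, 1.0]
    for i, j in combinations(range(len(top)), 2):
        denom = (top[i] - bottom[i]) - (top[j] - bottom[j])
        if denom != 0:
            p = (bottom[j] - bottom[i]) / denom
            if 0.0 <= p <= 1.0:
                candidates.append(p)
    return max(worst_case(p) for p in candidates)


def ffq_update(q_table, s, a, o, reward, s_next, alpha, gamma, other_is_friend):
    """One Q-learning-style update using the friend or foe value of the next state."""
    value = friend_value if other_is_friend else foe_value
    target = reward + gamma * value(q_table[s_next])
    q_table[s][a][o] = (1 - alpha) * q_table[s][a][o] + alpha * target
    return q_table[s][a][o]


# Matching-pennies payoffs for the learner in state 0, zeros in state 1.
q_table = {0: [[1.0, -1.0], [-1.0, 1.0]], 1: [[0.0, 0.0], [0.0, 0.0]]}
print(friend_value(q_table[0]), foe_value(q_table[0]))
```

On the matching-pennies table, the friend value is the best single entry, while the foe value is what the learner can guarantee by mixing its two actions equally. A general implementation of the foe case would need a linear program over the learner's mixed strategy.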

Michael L. Littman

[1] L. Shapley, et al. Stochastic Games, 1953, Proceedings of the National Academy of Sciences.

[2] J. van der Wal, et al. Stochastic dynamic programming: successive approximations and nearly optimal strategies for Markov decision processes and Markov games, 1981.

[3] Michael I. Jordan, et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES, 1996.

[4] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.

[5] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.

[6] Csaba Szepesvári, et al. A Generalized Reinforcement-Learning Model: Convergence and Applications, 1996, ICML.

[7] J. Filar, et al. Competitive Markov Decision Processes, 1996.

[8] Michael P. Wellman, et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm, 1998, ICML.

[9] Michael H. Bowling, et al. Convergence Problems of General-Sum Multiagent Reinforcement Learning, 2000, ICML.

[10] Michael P. Wellman, et al. Experimental Results on Q-Learning for General-Sum Stochastic Games, 2000, ICML.

[11] Peter Stone, et al. Leading Best-Response Strategies in Repeated Games, 2001, International Joint Conference on Artificial Intelligence.