AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents
[1] J. Nash. Equilibrium Points in N-Person Games , 1950, Proceedings of the National Academy of Sciences of the United States of America.
[2] J. Robinson. An Iterative Method of Solving a Game , 1951, Classics in Game Theory.
[3] Koichi Miyasawa (宮沢 光一). On the convergence of the learning process in a 2 x 2 non-zero-sum two-person game , 1961 .
[4] C. E. Lemke,et al. Equilibrium Points of Bimatrix Games , 1964 .
[5] S. Vajda. Some topics in two-person games , 1971 .
[6] R. Aumann. Subjectivity and Correlation in Randomized Strategies , 1974 .
[7] H. Simon,et al. Models of Bounded Rationality: Empirically Grounded Economic Reason , 1997 .
[8] L. C. Thomas,et al. Stochastic Games with Finite State and Action Spaces , 1988 .
[9] Eitan Zemel,et al. Nash and correlated equilibria: Some complexity considerations , 1989 .
[10] John Nachbar. “Evolutionary” selection dynamics in games: Convergence and limit properties , 1990 .
[11] E. Kalai,et al. Rational Learning Leads to Nash Equilibrium , 1993 .
[12] Manfred K. Warmuth,et al. The Weighted Majority Algorithm , 1994, Inf. Comput..
[13] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[14] Nicolò Cesa-Bianchi,et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem , 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[15] D. Fudenberg,et al. Consistency and Cautious Fictitious Play , 1995 .
[16] John Nachbar. Prediction, optimization, and learning in repeated games , 1997 .
[17] Csaba Szepesvári,et al. A Generalized Reinforcement-Learning Model: Convergence and Applications , 1996, ICML.
[18] Robert H. Crites,et al. Multiagent reinforcement learning in the Iterated Prisoner's Dilemma , 1996, Bio Systems.
[19] Dean P. Foster,et al. Calibrated Learning and Correlated Equilibrium , 1997 .
[20] S. Hart,et al. A simple adaptive procedure leading to correlated equilibrium , 2000 .
[21] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[22] Michael P. Wellman,et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.
[23] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .
[24] Y. Freund,et al. Adaptive game playing using multiplicative weights , 1999 .
[25] Sandip Sen,et al. Learning in multiagent systems , 1999 .
[26] D. Fudenberg,et al. Conditional Universal Consistency , 1999 .
[27] Ronen I. Brafman,et al. A near-optimal polynomial time algorithm for learning in certain classes of stochastic games , 2000, Artif. Intell..
[28] Leonid Sheremetov,et al. Weiss, Gerhard. Multiagent Systems a Modern Approach to Distributed Artificial Intelligence , 2009 .
[29] Yishay Mansour,et al. Nash Convergence of Gradient Dynamics in General-Sum Games , 2000, UAI.
[30] Bikramjit Banerjee,et al. Fast Concurrent Reinforcement Learners , 2001, IJCAI.
[31] Michael A. Goodrich,et al. Satisficing and Learning Cooperation in the Prisoner's Dilemma , 2001, IJCAI.
[32] Christos H. Papadimitriou,et al. Algorithms, Games, and the Internet , 2001, ICALP.
[33] Gunes Ercal,et al. On No-Regret Learning, Fictitious Play, and Nash Equilibrium , 2001, ICML.
[34] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.
[35] H. P. Young,et al. On the impossibility of predicting the behavior of rational agents , 2001, Proceedings of the National Academy of Sciences of the United States of America.
[36] John Nachbar,et al. Bayesian learning in repeated games of incomplete information , 2001, Soc. Choice Welf..
[37] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell..
[38] Ronen I. Brafman,et al. Efficient learning equilibrium , 2004, Artificial Intelligence.
[39] Xiaofeng Wang,et al. Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games , 2002, NIPS.
[40] Yoav Shoham,et al. Polynomial-time reinforcement learning of near-optimal policies , 2002, AAAI/IAAI.
[41] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[42] Amy Greenwald,et al. A General Class of No-Regret Learning Algorithms and Game-Theoretic Equilibria , 2003, COLT.
[43] Vincent Conitzer,et al. BL-WoLF: A Framework For Loss-Bounded Learnability In Zero-Sum Games , 2003, ICML.
[44] S. Hart,et al. Uncoupled Dynamics Do Not Lead to Nash Equilibrium , 2003 .
[45] Keith B. Hall,et al. Correlated Q-Learning , 2003, ICML.
[46] Peter Stone,et al. A polynomial-time Nash equilibrium algorithm for repeated games , 2003, EC '03.
[47] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[48] Tuomas Sandholm,et al. Learning Near-Pareto-Optimal Conventions in Polynomial Time , 2003, NIPS.
[49] Vincent Conitzer,et al. Complexity Results about Nash Equilibria , 2002, IJCAI.
[50] Vincent Conitzer,et al. Communication complexity as a lower bound for learning in games , 2004, ICML.
[51] Yoav Shoham,et al. New Criteria and a New Algorithm for Learning in Multi-Agent Systems , 2004, NIPS.
[52] Bikramjit Banerjee,et al. Performance Bounded Reinforcement Learning in Strategic Interactions , 2004, AAAI.
[53] Michael H. Bowling,et al. Convergence and No-Regret in Multiagent Learning , 2004, NIPS.
[54] Amotz Cahn,et al. General procedures leading to correlated equilibria , 2004, Int. J. Game Theory.
[55] Sham M. Kakade,et al. Deterministic calibration and Nash equilibrium , 2004, J. Comput. Syst. Sci..
[56] Vincent Conitzer,et al. Mixed-Integer Programming Methods for Finding Nash Equilibria , 2005, AAAI.
[57] Ronen I. Brafman,et al. Optimal Efficient Learning Equilibrium: Imperfect Monitoring in Symmetric Games , 2005, AAAI.
[58] Yoav Shoham,et al. Learning against opponents with bounded memory , 2005, IJCAI.
[59] Yoav Shoham,et al. Simple search methods for finding a Nash equilibrium , 2004, Games Econ. Behav..