Generalized multiagent learning with performance bound
暂无分享,去创建一个
[1] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[2] J. Hofbauer,et al. Uncoupled Dynamics Do Not Lead to Nash Equilibrium , 2003 .
[3] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[4] Bikramjit Banerjee,et al. Performance Bounded Reinforcement Learning in Strategic Interactions , 2004, AAAI.
[5] Sandip Sen,et al. Learning to Coordinate without Sharing Information , 1994, AAAI.
[6] Eric van Damme,et al. Non-Cooperative Games , 2000 .
[7] Vincent Conitzer,et al. BL-WoLF: A Framework For Loss-Bounded Learnability In Zero-Sum Games , 2003, ICML.
[8] Csaba Szepesvári,et al. A Generalized Reinforcement-Learning Model: Convergence and Applications , 1996, ICML.
[9] Yoav Shoham,et al. New Criteria and a New Algorithm for Learning in Multi-Agent Systems , 2004, NIPS.
[10] Manfred K. Warmuth,et al. The Weighted Majority Algorithm , 1994, Inf. Comput..
[11] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .
[12] Yishay Mansour,et al. Nash Convergence of Gradient Dynamics in General-Sum Games , 2000, UAI.
[13] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell..
[14] Manuela M. Veloso,et al. Rational and Convergent Learning in Stochastic Games , 2001, IJCAI.
[15] Tuomas Sandholm,et al. On Multiagent Q-Learning in a Semi-Competitive Domain , 1995, Adaption and Learning in Multi-Agent Systems.
[16] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.
[17] M. Nowak,et al. A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game , 1993, Nature.
[18] Gerald Tesauro,et al. Extending Q-Learning to General Adaptive Multi-Agent Systems , 2003, NIPS.
[19] Jeffrey S. Rosenschein,et al. Best-response multiagent learning in non-stationary environments , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[20] Nicolò Cesa-Bianchi,et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem , 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[21] Sandip Sen,et al. Adaption and Learning in Multi-Agent Systems , 1995, Lecture Notes in Computer Science.
[22] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[23] Adam Tauman Kalai,et al. Online convex optimization in the bandit setting: gradient descent without a gradient , 2004, SODA '05.
[24] Y. Freund,et al. Adaptive game playing using multiplicative weights , 1999 .
[25] Michael P. Wellman,et al. Nash Q-Learning for General-Sum Stochastic Games , 2003, J. Mach. Learn. Res..
[26] L. Shapley. A note on the Lemke-Howson algorithm , 1974 .
[27] Michael P. Wellman,et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.
[28] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[29] Gunes Ercal,et al. On No-Regret Learning, Fictitious Play, and Nash Equilibrium , 2001, ICML.
[30] Michael H. Bowling,et al. Convergence and No-Regret in Multiagent Learning , 2004, NIPS.
[31] D. Fudenberg,et al. Consistency and Cautious Fictitious Play , 1995 .
[32] Vincent Conitzer,et al. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents , 2003, Machine Learning.
[33] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[34] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[35] Xiaofeng Wang,et al. Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games , 2002, NIPS.
[36] Keith B. Hall,et al. Correlated Q-Learning , 2003, ICML.
[37] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.