Muhammed O. Sayin | Kaiqing Zhang | David S. Leslie | Asuman Ozdaglar | Tamer Başar
[1] Yuandong Tian,et al. Provably Efficient Policy Gradient Methods for Two-Player Zero-Sum Markov Games , 2021, ArXiv.
[2] J. Hofbauer,et al. Uncoupled Dynamics Do Not Lead to Nash Equilibrium , 2003 .
[3] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[4] R. Karp,et al. On Nonterminating Stochastic Games , 1966 .
[5] Devavrat Shah,et al. On Reinforcement Learning for Turn-based Zero-sum Markov Games , 2020, FODS.
[6] Serdar Yüksel,et al. Decentralized Q-Learning for Stochastic Teams and Games , 2015, IEEE Transactions on Automatic Control.
[7] Craig Boutilier,et al. Planning, Learning and Coordination in Multiagent Decision Processes , 1996, TARK.
[8] Michael P. Wellman,et al. Nash Q-Learning for General-Sum Stochastic Games , 2003, J. Mach. Learn. Res..
[9] Tom Schaul,et al. StarCraft II: A New Challenge for Reinforcement Learning , 2017, ArXiv.
[10] William H. Sandholm,et al. Preference Evolution, Two-Speed Dynamics, and Rapid Social Change , 2001 .
[11] Tiancheng Yu,et al. Provably Efficient Online Agnostic Learning in Markov Games , 2020, ArXiv.
[12] Gábor Lugosi,et al. Prediction, learning, and games , 2006 .
[13] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[14] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .
[15] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[16] Pablo Hernandez-Leal,et al. A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity , 2017, ArXiv.
[17] Y. Freund,et al. Adaptive game playing using multiplicative weights , 1999 .
[18] D. Fudenberg,et al. Learning and Equilibrium , 2009 .
[19] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.
[20] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[21] Keith B. Hall,et al. Correlated Q-Learning , 2003, ICML.
[22] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[23] Lin F. Yang,et al. Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity , 2019, AISTATS.
[24] Noah Golowich,et al. Independent Policy Gradient Methods for Competitive Reinforcement Learning , 2021, NeurIPS.
[25] Dean Phillips Foster,et al. Regret Testing: Learning to Play Nash Equilibrium Without Knowing You Have an Opponent , 2006 .
[26] Daniel J. Singer. To the Best of Our Knowledge , 2021, The Philosophical Review.
[27] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res..
[28] John N. Tsitsiklis,et al. Asynchronous Stochastic Approximation and Q-Learning , 1994, Machine Learning.
[29] J. Robinson. An Iterative Method of Solving a Game , 1951, Classics in Game Theory.
[30] Demis Hassabis,et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm , 2017, ArXiv.
[31] M. Benaïm. Dynamics of stochastic approximation algorithms , 1999 .
[32] Chen-Yu Wei,et al. Online Reinforcement Learning in Stochastic Games , 2017, NIPS.
[33] Manuela M. Veloso,et al. Rational and Convergent Learning in Stochastic Games , 2001, IJCAI.
[34] Etienne Perot,et al. Deep Reinforcement Learning framework for Autonomous Driving , 2017, Autonomous Vehicles and Machines.
[35] Matthew E. Taylor,et al. A survey and critique of multiagent deep reinforcement learning , 2019, Autonomous Agents and Multi-Agent Systems.
[36] C. Harris. On the Rate of Convergence of Continuous-Time Fictitious Play , 1998 .
[37] L. Shapley,et al. Stochastic Games , 1953, Proceedings of the National Academy of Sciences.
[38] Stef Tijs,et al. Fictitious play applied to sequences of games and discounted stochastic games , 1982 .
[39] Qiaomin Xie,et al. Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium , 2020, COLT.
[40] V. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint , 2008 .
[41] Bruno Scherrer,et al. Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games , 2015, ICML.
[42] J. Filar,et al. Competitive Markov Decision Processes , 1996 .
[43] L. Buşoniu,et al. A comprehensive survey of multi-agent reinforcement learning , 2011 .
[44] Ying Wang,et al. A machine-learning approach to multi-robot coordination , 2008, Eng. Appl. Artif. Intell..
[45] Shimon Whiteson,et al. Multiagent Reinforcement Learning for Urban Traffic Control Using Coordination Graphs , 2008, ECML/PKDD.
[46] Tamer Basar,et al. Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games , 2019, NeurIPS.
[47] Chi Jin,et al. Near-Optimal Reinforcement Learning with Self-Play , 2020, NeurIPS.
[48] Sham M. Kakade,et al. Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity , 2020, NeurIPS.
[49] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell..
[50] Guillaume J. Laurent,et al. Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems , 2012, The Knowledge Engineering Review.
[51] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[52] Amnon Shashua,et al. Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving , 2016, ArXiv.
[53] Chi Jin,et al. Provable Self-Play Algorithms for Competitive Reinforcement Learning , 2020, ICML.
[54] L. Shapley,et al. Fictitious Play Property for Games with Identical Interests , 1996 .
[55] Vivek S. Borkar,et al. Reinforcement Learning in Markovian Evolutionary Games , 2002, Adv. Complex Syst..
[56] Csaba Szepesvári,et al. A Unified Analysis of Value-Function-Based Reinforcement-Learning Algorithms , 1999, Neural Computation.
[57] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[58] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.
[59] Josef Hofbauer,et al. Learning in perturbed asymmetric games , 2005, Games Econ. Behav..
[60] Zhengyuan Zhou,et al. Learning in games with continuous action sets and unknown payoff functions , 2019, Math. Program..
[61] David S. Leslie,et al. Individual Q-Learning in Normal Form Games , 2005, SIAM J. Control. Optim..
[62] Qinghua Liu,et al. A Sharp Analysis of Model-based Reinforcement Learning with Self-Play , 2020, ICML.
[63] Zibo Xu,et al. Best-response dynamics in zero-sum stochastic games , 2020, J. Econ. Theory.
[64] A. M. Fink,et al. Equilibrium in a stochastic $n$-person game , 1964 .
[65] Zhuoran Yang,et al. Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games , 2021, ICML.
[66] Olivier Pietquin,et al. Actor-Critic Fictitious Play in Simultaneous Move Multistage Games , 2018, AISTATS.
[67] A. Ozdaglar,et al. Fictitious play in zero-sum stochastic games , 2020, SIAM J. Control. Optim..
[68] Haipeng Luo,et al. Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games , 2021, COLT.
[69] Jeffrey C. Ely,et al. Nash Equilibrium and the Evolution of Preferences , 2001, J. Econ. Theory.