Exploration Analysis in Finite-Horizon Turn-based Stochastic Games
Tongzheng Ren | Jun Zhu | Jialian Li | Yichi Zhou