Michael Bowling | Marc Lanctot | Karl Tuyls | Mohammad Gheshlaghi Azar | Dustin Morrill | Jean-Baptiste Lespiau | Julien Pérolat | Audrunas Gruslys | Martin Schmid | Finbarr Timbers | Rémi Munos | Vinicius Zambaldi | John Schultz
[1] S. Hart, et al. A simple adaptive procedure leading to correlated equilibrium, 2000.
[2] Tuomas Sandholm, et al. Deep Counterfactual Regret Minimization, 2018, ICML.
[3] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[4] Noam Brown, et al. Superhuman AI for multiplayer poker, 2019, Science.
[5] Noam Brown, et al. Superhuman AI for heads-up no-limit poker: Libratus beats top professionals, 2018, Science.
[6] Lasse Becker-Czarnetzki. Report on DeepStack Expert-Level Artificial Intelligence in Heads-Up No-Limit Poker, 2019.
[7] Michael H. Bowling, et al. Tractable Objectives for Robust Policy Optimization, 2012, NIPS.
[8] Yoav Shoham, et al. Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations, 2009.
[9] Adam Lerer, et al. DREAM: Deep Regret minimization with Advantage baselines and Model-free learning, 2020, ArXiv.
[10] Nicola Gatti, et al. Learning to Correlate in Multi-Player General-Sum Sequential Games, 2019, NeurIPS.
[11] Doina Precup, et al. Eligibility Traces for Off-Policy Policy Evaluation, 2000, ICML.
[12] Peter L. Bartlett, et al. POLITEX: Regret Bounds for Policy Iteration using Expert Prediction, 2019, ICML.
[13] Honglak Lee, et al. Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units, 2016, ICML.
[14] Daniel Guo, et al. Never Give Up: Learning Directed Exploration Strategies, 2020, ICLR.
[15] Kurt Keutzer, et al. Regret Minimization for Partially Observable Deep Reinforcement Learning, 2017, ICML.
[16] Michael H. Bowling, et al. Actor-Critic Policy Optimization in Partially Observable Multiagent Environments, 2018, NeurIPS.
[17] Michael H. Bowling, et al. Solving Imperfect Information Games Using Decomposition, 2013, AAAI.
[18] Neil Burch, et al. Heads-up limit hold'em poker is solved, 2015, Science.
[19] Michael H. Bowling, et al. Rethinking Formal Models of Partially Observable Multiagent Decision Making, 2019, Artif. Intell.
[20] Sébastien Bubeck, et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, 2012, Found. Trends Mach. Learn.
[21] Y. Mansour, et al. Algorithmic Game Theory: Learning, Regret Minimization, and Equilibria, 2007.
[22] Karl Tuyls, et al. Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent, 2019, IJCAI.
[23] David Silver, et al. Fictitious Self-Play in Extensive-Form Games, 2015, ICML.
[24] Kevin Waugh, et al. DeepStack: Expert-level artificial intelligence in heads-up no-limit poker, 2017, Science.
[25] Michael H. Bowling, et al. Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines, 2018, AAAI.
[26] Ian A. Kash, et al. Combining No-regret and Q-learning, 2019, AAMAS.
[27] Kevin Waugh, et al. Solving Games with Functional Regret Estimation, 2014, AAAI Workshop: Computer Poker and Imperfect Information.
[28] Tuomas Sandholm, et al. Solving Imperfect-Information Games via Discounted Regret Minimization, 2018, AAAI.
[29] David Silver, et al. A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning, 2017, NIPS.
[30] Yuan Qi, et al. Double Neural Counterfactual Regret Minimization, 2018, ICLR.
[31] Nicola Gatti, et al. No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium, 2020, NeurIPS.
[32] Kevin Waugh, et al. Monte Carlo Sampling for Regret Minimization in Extensive Games, 2009, NIPS.
[33] Yishay Mansour, et al. Experts in a Markov Decision Process, 2004, NIPS.
[34] Sriram Srinivasan, et al. OpenSpiel: A Framework for Reinforcement Learning in Games, 2019, ArXiv.
[35] Michael H. Bowling, et al. Equilibrium Approximation Quality of Current No-Limit Poker Bots, 2016, AAAI Workshops.
[36] Michael H. Bowling, et al. Regret Minimization in Games with Incomplete Information, 2007, NIPS.
[37] Duane Szafron, et al. Efficient Monte Carlo Counterfactual Regret Minimization in Games with Many Player Actions, 2012, NIPS.
[38] Wojciech M. Czarnecki, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning, 2019, Nature.