Stochastic Regret Minimization in Extensive-Form Games
Tuomas Sandholm | Christian Kroer | Gabriele Farina
[1] Tuomas Sandholm,et al. Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning , 2017, ICML.
[2] Michael H. Bowling,et al. Counterfactual Regret Minimization in Sequential Security Games , 2016, AAAI.
[3] Lasse Becker-Czarnetzki. Report on DeepStack Expert-Level Artificial Intelligence in Heads-Up No-Limit Poker , 2019 .
[4] Branislav Bosanský,et al. Sequence-Form Algorithm for Computing Stackelberg Equilibria in Extensive-Form Games , 2015, AAAI.
[5] Branislav Bosanský,et al. An Exact Double-Oracle Algorithm for Zero-Sum Extensive-Form Games with Imperfect Information , 2014, J. Artif. Intell. Res.
[6] Tuomas Sandholm,et al. Robust Stackelberg Equilibria in Extensive-Form Games and Extension to Limited Lookahead , 2017, AAAI.
[7] Tuomas Sandholm,et al. Regret Circuits: Composability of Regret Minimizers , 2018, ICML.
[8] David A. Freedman,et al. On the Amount of Variance Needed to Escape from a Strip , 1973 .
[9] Michael H. Bowling,et al. Bayes' Bluff: Opponent Modelling in Poker , 2005, UAI.
[10] Kazuoki Azuma. Weighted Sums of Certain Dependent Random Variables , 1967 .
[11] D. Freedman. On Tail Probabilities for Martingales , 1975 .
[12] S. Ross. Goofspiel -- The Game of Pure Strategy , 1971 .
[13] Noam Brown,et al. Superhuman AI for multiplayer poker , 2019, Science.
[14] Kevin Waugh,et al. Faster First-Order Methods for Extensive-Form Game Solving , 2015, EC.
[15] Noam Brown,et al. Superhuman AI for heads-up no-limit poker: Libratus beats top professionals , 2018, Science.
[16] B. Stengel,et al. Efficient Computation of Behavior Strategies , 1996 .
[17] Jacob D. Abernethy,et al. Beating the adaptive bandit with high probability , 2009, 2009 Information Theory and Applications Workshop.
[18] Tuomas Sandholm,et al. Correlation in Extensive-Form Games: Saddle-Point Formulation and Benchmarks , 2019, NeurIPS.
[19] Kevin Waugh,et al. DeepStack: Expert-level artificial intelligence in heads-up no-limit poker , 2017, Science.
[20] Tuomas Sandholm,et al. Steering Evolution Strategically: Computational Game Theory and Opponent Exploitation for Treatment Planning, Drug Design, and Synthetic Biology , 2015, AAAI.
[21] Nicholas R. Jennings,et al. Introducing Alarms in Adversarial Patrolling Games , 2013 .
[22] Kevin Waugh,et al. Accelerating Best Response Calculation in Large Extensive Games , 2011, IJCAI.
[23] Michael H. Bowling,et al. Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines , 2018, AAAI.
[24] Tuomas Sandholm,et al. The State of Solving Large Incomplete-Information Games, and Application to Poker , 2010, AI Mag.
[25] Kevin Waugh,et al. Monte Carlo Sampling for Regret Minimization in Extensive Games , 2009, NIPS.
[26] Michael H. Bowling,et al. Regret Minimization in Games with Incomplete Information , 2007, NIPS.
[27] Michael H. Bowling,et al. Tractable Objectives for Robust Policy Optimization , 2012, NIPS.
[28] Duane Szafron,et al. Generalized Sampling and Variance in Counterfactual Regret Minimization , 2012, AAAI.
[29] S. Hart,et al. A simple adaptive procedure leading to correlated equilibrium , 2000 .
[30] Javier Peña,et al. Smoothing Techniques for Computing Nash Equilibria of Sequential Games , 2010, Math. Oper. Res.
[31] Christopher Archibald,et al. Modeling billiards games , 2009, AAMAS.
[32] Tuomas Sandholm,et al. Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games , 2018, AAAI.
[33] Yoram Singer,et al. A primal-dual perspective of online learning algorithms , 2007, Machine Learning.
[34] Tuomas Sandholm,et al. Power napping with loud neighbors: optimal energy-constrained jamming and anti-jamming , 2014, WiSec '14.
[35] Thomas P. Hayes,et al. High-Probability Regret Bounds for Bandit Online Linear Optimization , 2008, COLT.
[36] Francesco Orabona. A Modern Introduction to Online Learning , 2019, ArXiv.
[37] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables , 1963 .
[38] Tuomas Sandholm,et al. Regret-Based Pruning in Extensive-Form Games , 2015, NIPS.
[39] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[40] Tuomas Sandholm,et al. Solving Imperfect-Information Games via Discounted Regret Minimization , 2018, AAAI.
[41] Michael H. Bowling,et al. Solving Heads-Up Limit Texas Hold'em , 2015, IJCAI.
[42] Kevin Waugh,et al. Faster algorithms for extensive-form game solving via improved smoothing functions , 2018, Mathematical Programming.
[43] Tuomas Sandholm,et al. Dynamic Thresholding and Pruning for Regret Minimization , 2017, AAAI.
[44] D. Koller,et al. Efficient Computation of Equilibria for Extensive Two-Person Games , 1996 .
[45] Zhu Han,et al. Wireless Resource Scheduling in Virtualized Radio Access Networks Using Stochastic Learning , 2018, IEEE Transactions on Mobile Computing.