Regret Circuits: Composability of Regret Minimizers

Regret minimization is a powerful tool for solving large-scale problems; it was recently used in breakthrough results for large-scale extensive-form game solving. This was achieved by composing simplex regret minimizers into an overall regret-minimization framework for extensive-form game strategy spaces. In this paper we study the general composability of regret minimizers. We derive a calculus for constructing regret minimizers for composite convex sets that are obtained from convexity-preserving operations on simpler convex sets. We show that local regret minimizers for the simpler sets can be combined with additional regret minimizers into an aggregate regret minimizer for the composite set. As one application, we show that the counterfactual regret minimization (CFR) framework can be constructed easily from our framework. We also show ways to include curtailing (constraining) operations in our framework. In particular, these operations enable the construction of a generalization of CFR for extensive-form games with general convex strategy constraints that can cut across decision points.
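The composition idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class names (`RegretMatching`, `CartesianProduct`, `ConvexHull`) and the `recommend`/`observe_loss` interface are hypothetical, losses are assumed to be linear and given as plain lists, and regret matching is used only as one possible simplex regret minimizer. The product combinator simply routes each block of the loss vector to its child, while the convex-hull combinator uses an additional simplex regret minimizer to mix the children's recommendations.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


class RegretMatching:
    """Regret matching on the probability simplex with linear losses."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.cum_regret = [0.0] * n
        self.last = self._strategy()

    def _strategy(self) -> Vector:
        # Play proportionally to positive cumulative regret, else uniformly.
        pos = [max(r, 0.0) for r in self.cum_regret]
        total = sum(pos)
        if total <= 0.0:
            return [1.0 / self.n] * self.n
        return [p / total for p in pos]

    def recommend(self) -> Vector:
        self.last = self._strategy()
        return self.last

    def observe_loss(self, loss: Vector) -> None:
        expected = dot(loss, self.last)
        for i in range(self.n):
            self.cum_regret[i] += expected - loss[i]


class CartesianProduct:
    """Hypothetical combinator for X x Y: each child sees its own loss block."""

    def __init__(self, left, right, left_dim: int) -> None:
        self.left, self.right, self.left_dim = left, right, left_dim

    def recommend(self) -> Vector:
        return self.left.recommend() + self.right.recommend()

    def observe_loss(self, loss: Vector) -> None:
        self.left.observe_loss(loss[: self.left_dim])
        self.right.observe_loss(loss[self.left_dim:])


class ConvexHull:
    """Hypothetical combinator for conv(X_1, ..., X_k) in a shared space.

    An extra simplex regret minimizer chooses the mixing weights; its loss
    for child i is the loss incurred by that child's recommendation.
    Call recommend() before observe_loss() in every round.
    """

    def __init__(self, children) -> None:
        self.children = children
        self.mixer = RegretMatching(len(children))

    def recommend(self) -> Vector:
        self.points = [c.recommend() for c in self.children]
        self.weights = self.mixer.recommend()
        dim = len(self.points[0])
        return [
            sum(w * p[j] for w, p in zip(self.weights, self.points))
            for j in range(dim)
        ]

    def observe_loss(self, loss: Vector) -> None:
        for child in self.children:
            child.observe_loss(loss)
        self.mixer.observe_loss([dot(loss, p) for p in self.points])
```

In this sketch a composite minimizer exposes the same two methods as its children, so combinators can be nested; for example, `CartesianProduct(RegretMatching(2), RegretMatching(2), left_dim=2)` can itself be a child of another combinator.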
