Connecting Optimal Ex-Ante Collusion in Teams to Extensive-Form Correlation: Faster Algorithms and Positive Complexity Results

Equal contribution. Affiliations: Computer Science Department, Carnegie Mellon University, Pittsburgh PA 15213; DEIB, Politecnico di Milano, Milano, Italy; Strategic Machine, Inc.; Strategy Robot, Inc.; Optimized Markets, Inc. Correspondence to: Gabriele Farina <gfarina@cs.cmu.edu>, Andrea Celli <andrea.celli@polimi.it>, Nicola Gatti <nicola.gatti@polimi.it>, Tuomas Sandholm <sandholm@cs.cmu.edu>. Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021. Copyright 2021 by the author(s).

We focus on the problem of finding an optimal strategy for a team of players that faces an opponent in an imperfect-information zero-sum extensive-form game. Team members are not allowed to communicate during play but can coordinate before the game. In this setting, it is known that the best the team can do is sample a profile of potentially randomized strategies (one per player) from a joint (a.k.a. correlated) probability distribution at the beginning of the game. In this paper, we first provide new modeling results about computing such an optimal distribution by drawing a connection to a different literature on extensive-form correlation. Second, we provide an algorithm that allows one to cap the number of profiles employed in the solution. Increasing the cap yields an anytime algorithm. We find that a handful of well-chosen profiles often suffices to reach optimal utility for the team. This enables team members to coordinate through a simple and understandable plan. Finally, inspired by this observation and leveraging theoretical concepts that we introduce, we develop an efficient column-generation algorithm for finding an optimal distribution for the team. We evaluate it on a suite of common benchmark games. It is three orders of magnitude faster than the prior state of the art on games that the latter can solve, and it can also solve several games that were previously unsolvable.
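To illustrate why sampling a profile from a joint distribution can help a team that cannot communicate during play, the following is a minimal sketch on a toy one-shot game that is not taken from the paper. The game, the payoff function `team_payoff`, and all other names are hypothetical: two team members each pick an action in {0, 1}, the opponent guesses an action, and the team scores 1 only when both members match and the opponent guesses wrong. The sketch compares a correlated plan that uses just two profiles against the best independent randomization found by a grid search.

```python
from itertools import product

ACTIONS = (0, 1)


def team_payoff(a1, a2, b):
    """Toy payoff (an assumption): 1 if both members match and the opponent's guess b is wrong."""
    return 1.0 if a1 == a2 and a1 != b else 0.0


def worst_case_value(joint):
    """Expected team payoff against a best-responding opponent.

    `joint` maps each team profile (a1, a2) to its probability.
    """
    return min(
        sum(prob * team_payoff(a1, a2, b) for (a1, a2), prob in joint.items())
        for b in ACTIONS
    )


def product_distribution(p, q):
    """Joint distribution when members randomize independently (p, q = Pr[action 1])."""
    return {
        (a1, a2): (p if a1 else 1 - p) * (q if a2 else 1 - q)
        for a1, a2 in product(ACTIONS, repeat=2)
    }


# Correlated plan: before the game, flip one shared coin and both play its outcome.
correlated = {(0, 0): 0.5, (1, 1): 0.5}
correlated_value = worst_case_value(correlated)

# Best independent randomization, approximated on a 101 x 101 grid.
grid = [i / 100 for i in range(101)]
independent_value = max(
    worst_case_value(product_distribution(p, q)) for p in grid for q in grid
)

print(correlated_value)   # 0.5
print(independent_value)  # 0.25
```

In this toy game a correlated plan over two profiles guarantees the team 0.5, while independent randomization guarantees at most 0.25. The sketch only illustrates the benefit of ex-ante correlation with a small support; it does not implement the column-generation algorithm described in the abstract.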
