Improved Algorithms for Learning Equilibria in Simulation-Based Games

We tackle a fundamental problem in empirical game-theoretic analysis (EGTA), that of learning equilibria of simulation-based games. Such games cannot be described in analytical form; instead, a black-box simulator can be queried to obtain noisy samples of utilities. Our first theorem establishes that uniform approximations of simulation-based games are equilibrium preserving. We then design algorithms that uniformly approximate simulation-based games with finite-sample guarantees. Our first algorithm, global sampling (GS), extends previous work, which constructs confidence intervals assuming only that utilities are bounded, by using confidence intervals that are sensitive to variance. The second, progressive sampling with pruning (PSP), samples progressively and stops sampling a strategy profile (i.e., prunes it) as soon as it determines that the corresponding utilities have been estimated well enough for equilibrium computation. We experiment with our algorithms using both GAMUT, a state-of-the-art game generator, and Gambit, a state-of-the-art game solver. For a broad swath of games, we show that GS using our variance-sensitive bounds outperforms previous work, and that PSP can significantly outperform GS. Here “outperform” means achieving the same guarantees with far fewer samples.
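To make the idea of a uniform approximation with finite-sample guarantees concrete, the following is a minimal Python sketch, not the authors' implementation. It draws the same number of samples for every strategy profile and attaches a confidence radius to each estimated utility. Two radii are shown: a Hoeffding radius that uses only the utility range, and an empirical Bernstein radius in the style of Maurer and Pontil that also uses the sample variance. The exact bounds used in the paper may differ. The toy game `TRUE_UTILITIES`, the Bernoulli `toy_simulator`, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import mean, variance


def hoeffding_radius(c, m, n_entries, delta):
    """Two-sided Hoeffding radius for utilities with range width c,
    union-bounded over n_entries utilities, from m samples each."""
    return c * math.sqrt(math.log(2 * n_entries / delta) / (2 * m))


def empirical_bernstein_radius(sample_var, c, m, n_entries, delta):
    """Two-sided empirical Bernstein radius (Maurer-Pontil style, assumed here),
    union-bounded over n_entries utilities; sample_var is the unbiased
    sample variance of the m draws."""
    log_term = math.log(4 * n_entries / delta)
    return math.sqrt(2 * sample_var * log_term / m) + 7 * c * log_term / (3 * (m - 1))


def global_sampling(simulator, profiles, n_players, c, m, delta):
    """Sample every profile m times; return per-(profile, player) estimates
    and variance-sensitive confidence radii."""
    n_entries = len(profiles) * n_players
    estimates, radii = {}, {}
    for profile in profiles:
        draws = [simulator(profile) for _ in range(m)]  # each draw: one utility per player
        for player in range(n_players):
            xs = [draw[player] for draw in draws]
            estimates[(profile, player)] = mean(xs)
            radii[(profile, player)] = empirical_bernstein_radius(
                variance(xs), c, m, n_entries, delta
            )
    return estimates, radii


# Hypothetical 2x2 game: expected utilities in [0, 1] for players 0 and 1.
TRUE_UTILITIES = {
    ("a", "a"): (0.98, 0.02),
    ("a", "b"): (0.50, 0.50),
    ("b", "a"): (0.10, 0.90),
    ("b", "b"): (0.70, 0.30),
}
_rng = random.Random(0)


def toy_simulator(profile):
    """Noisy black box: each utility is a Bernoulli draw with the true mean."""
    return tuple(1.0 if _rng.random() < u else 0.0 for u in TRUE_UTILITIES[profile])


if __name__ == "__main__":
    m, delta = 2000, 0.05
    est, rad = global_sampling(toy_simulator, list(TRUE_UTILITIES), 2, 1.0, m, delta)
    print("Hoeffding radius (same for all entries):", hoeffding_radius(1.0, m, 8, delta))
    for key in sorted(est):
        print(key, "estimate:", est[key], "Bernstein radius:", rad[key])
```

In this sketch the Hoeffding radius is identical for every entry, whereas the empirical Bernstein radius shrinks for low-variance entries such as profile `("a", "a")`. That difference is what a pruning scheme can exploit: an entry whose radius is already below the target error needs no further samples.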
