Learning Equilibria of Simulation-Based Games

We tackle a fundamental problem in empirical game-theoretic analysis (EGTA), that of learning equilibria of simulation-based games. Such games cannot be described in analytical form; instead, a black-box simulator can be queried to obtain noisy samples of utilities. Our approach to EGTA is in the spirit of probably approximately correct learning. We design algorithms that learn so-called empirical games, which uniformly approximate the utilities of simulation-based games with finite-sample guarantees. These algorithms can be instantiated with various concentration inequalities. Building on earlier work, we first apply Hoeffding's bound, but as the size of the game grows, this bound eventually becomes statistically intractable; hence, we also use the Rademacher complexity. Our main results state: with high probability, all equilibria of the simulation-based game are approximate equilibria in the empirical game (perfect recall); and conversely, all approximate equilibria in the empirical game are approximate equilibria in the simulation-based game (approximately perfect precision). We evaluate our algorithms on several synthetic games, showing that they make frugal use of data, produce accurate estimates more often than the theory predicts, and are robust to different forms of noise.
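As a rough illustration of the Hoeffding-based approach described above, the following minimal sketch estimates an empirical game from a noisy black-box simulator and reports its approximate pure equilibria. It is not the authors' implementation. The toy two-player game, the bounded uniform noise, and every name in it (`TRUE_UTILITIES`, `make_simulator`, `hoeffding_epsilon`, and so on) are assumptions made for illustration. The error bound is the standard two-sided Hoeffding inequality combined with a union bound over all player-profile utilities. If every utility is estimated to within ε, a profile's regret changes by at most 2ε, so the equilibria of the simulated game appear among the 2ε-approximate equilibria of the empirical game.

```python
import math
import random
from itertools import product

# Hypothetical toy game: strategy 0 = cooperate, 1 = defect; utilities in [0.1, 0.9].
TRUE_UTILITIES = {
    (0, 0): (0.6, 0.6),
    (0, 1): (0.1, 0.9),
    (1, 0): (0.9, 0.1),
    (1, 1): (0.3, 0.3),
}

def make_simulator(seed):
    """Return a black-box simulator adding uniform noise in [-0.1, 0.1]."""
    rng = random.Random(seed)
    def simulator(profile):
        return tuple(u + rng.uniform(-0.1, 0.1) for u in TRUE_UTILITIES[profile])
    return simulator

def hoeffding_epsilon(num_samples, num_utilities, value_range, delta):
    """Uniform error bound: two-sided Hoeffding plus a union bound."""
    return value_range * math.sqrt(
        math.log(2 * num_utilities / delta) / (2 * num_samples)
    )

def estimate_empirical_game(simulator, strategy_counts, num_samples):
    """Average num_samples noisy utility vectors for every pure profile."""
    game = {}
    for profile in product(*(range(k) for k in strategy_counts)):
        totals = [0.0] * len(strategy_counts)
        for _ in range(num_samples):
            sample = simulator(profile)
            totals = [t + u for t, u in zip(totals, sample)]
        game[profile] = tuple(t / num_samples for t in totals)
    return game

def pure_regret(game, strategy_counts, profile):
    """Largest gain any single player obtains by deviating unilaterally."""
    worst = 0.0
    for player, k in enumerate(strategy_counts):
        for alt in range(k):
            deviated = profile[:player] + (alt,) + profile[player + 1:]
            worst = max(worst, game[deviated][player] - game[profile][player])
    return worst

def approximate_pure_equilibria(game, strategy_counts, tolerance):
    """Pure profiles whose regret in the given game is at most tolerance."""
    return [p for p in game if pure_regret(game, strategy_counts, p) <= tolerance]

strategy_counts = (2, 2)
m, delta = 2000, 0.05
# 4 profiles x 2 players = 8 utilities; noisy samples lie in [0, 1], so the range is 1.
eps = hoeffding_epsilon(m, num_utilities=8, value_range=1.0, delta=delta)
game = estimate_empirical_game(make_simulator(0), strategy_counts, m)
print(eps, approximate_pure_equilibria(game, strategy_counts, 2 * eps))
```

The union bound is over every utility entry, so the number of samples needed for a fixed ε grows with the logarithm of the game size. This is the scaling that, per the abstract, motivates the switch to Rademacher complexity for larger games.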

Eli Upfal | Amy Greenwald | Enrique Areyan Viqueira | Cyrus Cousins
