Probably Almost Stable Strategy Profiles in Simulation-Based Games

Empirical studies of strategic settings commonly model player interactions under an assumed game-theoretic equilibrium, in order to predict what rational agents might do. In sufficiently complex settings, however, analysts cannot solve for exact equilibria and may instead solve a restricted game in which agents are limited to a tractable subset of strategies. This yields a solution, but one whose strategic stability in the original game is unclear. We propose a search and evaluation method that can guarantee a well-defined strategic stability property for the profile it yields, even if only a small subset of the possible strategies in the game has been analyzed. The method achieves this by combining statistical confidence interval estimation, a multiple-test correction, and empirical game-theoretic analysis. We also present an extension that more often finds genuine approximate equilibria by using simulated annealing, rather than simple random search, for strategy exploration. We demonstrate the method's efficacy in two example settings: the first-price sealed-bid auction and a cybersecurity game.
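
As a rough illustration of how a confidence interval and a multiple-test correction can combine into a stability statement, the following Python sketch bounds the fraction of sampled deviations that are beneficial. The abstract does not specify the interval type or the correction, so everything here is an assumption made for illustration: the one-sided Clopper-Pearson (exact binomial) bound, the Bonferroni correction, and the hypothetical function names `binom_cdf`, `upper_bound`, and `bonferroni_alpha`.

```python
import math


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def upper_bound(k, n, alpha):
    """One-sided exact (Clopper-Pearson) upper confidence bound.

    k: number of sampled deviations that were beneficial (assumed definition)
    n: number of deviations sampled
    alpha: allowed error probability for this single test
    """
    if k >= n:
        return 1.0
    lo, hi = 0.0, 1.0
    # The CDF is decreasing in p, so bisect for the p where it equals alpha.
    for _ in range(100):
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


def bonferroni_alpha(alpha, num_tests):
    """Per-test error level so that num_tests tests jointly err with probability <= alpha."""
    return alpha / num_tests


# Hypothetical usage: 10 candidate profiles are tested, each against 100 sampled
# deviations, and the candidate under inspection had no beneficial deviation.
per_test_alpha = bonferroni_alpha(0.05, 10)
bound = upper_bound(0, 100, per_test_alpha)
print(f"Upper bound on the beneficial-deviation fraction: {bound:.4f}")
```

Under these assumptions, a candidate profile would be accepted as "probably almost stable" when the bound falls below a chosen tolerance. Dividing alpha across the candidates is what keeps the overall confidence level valid when several profiles are tested in sequence.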
