Computing Equilibria with Two-Player Zero-Sum Continuous Stochastic Games with Switching Controller

Equilibrium computation in continuous games is a challenging open problem in artificial intelligence. In this paper, we design an iterative algorithm that finds an ε-approximate Markov perfect equilibrium in two-player zero-sum continuous stochastic games with switching controller. When the game is polynomial (i.e., the utility and state-transition functions are polynomial), our algorithm converges to ε = 0 by exploiting semidefinite programming. When the game is not polynomial, the algorithm relies on polynomial approximations and converges to a value of ε whose upper bound is a function of the maximum approximation error in the infinity norm. To our knowledge, this is the first algorithm for equilibrium approximation with arbitrary utility and transition functions that provides theoretical guarantees. We also evaluate the algorithm empirically.
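To make the role of the infinity-norm approximation error concrete, the sketch below interpolates a non-polynomial function with Chebyshev polynomials of increasing degree and estimates the maximum absolute error on a dense grid. It is an illustration only, not the algorithm of the paper: the function `utility`, the domain [0, 1], the grid size, and the choice of one-dimensional Chebyshev interpolation are all assumptions made for this example.

```python
import numpy as np
from numpy.polynomial import Chebyshev


def utility(x):
    """Hypothetical non-polynomial utility on [0, 1] (illustration only)."""
    return np.exp(-x) * np.sin(3.0 * x)


def sup_norm_error(func, degree, domain=(0.0, 1.0), n_grid=10_001):
    """Estimate max |func - p| on a dense grid, where p is the
    Chebyshev interpolant of the given degree on the domain."""
    poly = Chebyshev.interpolate(func, degree, domain=list(domain))
    grid = np.linspace(domain[0], domain[1], n_grid)
    return float(np.max(np.abs(func(grid) - poly(grid))))


# Estimated infinity-norm error for a few (assumed) polynomial degrees.
errors = {deg: sup_norm_error(utility, deg) for deg in (2, 4, 8, 12)}
for deg, err in errors.items():
    print(f"degree {deg:2d}: estimated sup-norm error = {err:.2e}")
```

The grid maximum is only an estimate of the true supremum. For a smooth function like this one, the error shrinks quickly as the degree grows, which is the quantity the stated bound on the equilibrium approximation depends on.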
