On Finding Local Nash Equilibria (and Only Local Nash Equilibria) in Zero-Sum Games

We propose local symplectic surgery, a two-timescale procedure for finding local Nash equilibria in two-player zero-sum games. We first show that previous gradient-based algorithms cannot guarantee convergence to local Nash equilibria due to the existence of non-Nash stationary points. By taking advantage of the differential structure of the game, we construct an algorithm for which the local Nash equilibria are the only attracting fixed points. We also show that the algorithm exhibits no oscillatory behavior in neighborhoods of equilibria and that it has the same per-iteration complexity as other recently proposed algorithms. We conclude by validating the algorithm on two numerical examples: a toy example with multiple Nash equilibria and a non-Nash equilibrium, and the training of a small generative adversarial network (GAN).
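The failure mode described above can be illustrated concretely. The following is a minimal sketch using a quadratic zero-sum game of our own construction (it is an assumption for illustration, not the paper's toy example): simultaneous gradient descent-ascent converges to a stationary point that is not a local Nash equilibrium, because the origin is a local maximum, not a minimum, for the minimizing player.

```python
# Hypothetical toy zero-sum game (our own construction, not the paper's example):
#   f(x, y) = -0.5*x**2 + 3*x*y - 1.5*y**2,  x minimizes f, y maximizes f.
# The origin is a stationary point of the simultaneous-gradient dynamics and,
# for small step sizes, an attractor (the Jacobian of the dynamics is Hurwitz).
# Yet it is NOT a local Nash equilibrium: along y = 0, f(x, 0) = -0.5*x**2,
# so x = 0 is a local *maximum* for the minimizing player.

def grad_f(x, y):
    """Partial derivatives (df/dx, df/dy) of the quadratic game above."""
    return (-x + 3.0 * y, 3.0 * x - 3.0 * y)

x, y, eta = 1.0, 1.0, 0.01
for _ in range(5000):
    gx, gy = grad_f(x, y)
    # Simultaneous play: x descends on f, y ascends on f.
    x, y = x - eta * gx, y + eta * gy

print(x, y)  # both iterates spiral into the non-Nash stationary point at 0
```

The spiraling here is the oscillatory behavior near stationary points that the abstract refers to; the proposed algorithm is designed so that such non-Nash attractors cannot arise.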
