On Finding Local Nash Equilibria (and Only Local Nash Equilibria) in Zero-Sum Games

We propose local symplectic surgery, a two-timescale procedure for finding local Nash equilibria in two-player zero-sum games. We first show that previous gradient-based algorithms cannot guarantee convergence to local Nash equilibria due to the existence of non-Nash stationary points. By taking advantage of the differential structure of the game, we construct an algorithm for which the local Nash equilibria are the only attracting fixed points. We also show that the algorithm exhibits no oscillatory behavior in neighborhoods of equilibria and has the same per-iteration complexity as other recently proposed gradient-based algorithms. We conclude by validating the algorithm on two numerical examples: a toy example with multiple Nash equilibria and a non-Nash stationary point, and the training of a small generative adversarial network (GAN).
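
To make the non-Nash phenomenon concrete, the following sketch (in Python; the quadratic cost, initial point, and step size are illustrative assumptions, not the paper's example) runs simultaneous gradient descent-ascent on a zero-sum game f(x, y) = -x^2/2 + 2xy - 3y^2/2 and then checks the second-order local Nash conditions at the limit point: the minimizer's curvature d^2f/dx^2 must be nonnegative and the maximizer's curvature d^2f/dy^2 nonpositive. The iterates converge to the origin, yet the origin fails the minimizer's condition, so it is an attracting stationary point of the gradient dynamics that is not a local Nash equilibrium.

# Minimal illustration (not the paper's algorithm): simultaneous
# gradient descent-ascent (GDA) on a hypothetical quadratic zero-sum
# game whose origin attracts the GDA dynamics but is not a local Nash
# equilibrium.

def f(x, y):
    # Cost for the minimizing player x; the maximizing player y receives -f.
    return -0.5 * x**2 + 2.0 * x * y - 1.5 * y**2

def grad_x(x, y):  # df/dx
    return -x + 2.0 * y

def grad_y(x, y):  # df/dy
    return 2.0 * x - 3.0 * y

x, y = 1.0, -0.5       # arbitrary initial point (assumption)
eta = 0.05             # step size (assumption)
for _ in range(2000):  # simultaneous GDA updates
    x, y = x - eta * grad_x(x, y), y + eta * grad_y(x, y)

print(f"GDA limit point: ({x:.6f}, {y:.6f}), f = {f(x, y):.6f}")  # approaches (0, 0)

# Second-order local Nash conditions at the limit point:
# minimizer needs d2f/dx2 >= 0, maximizer needs d2f/dy2 <= 0.
d2f_dx2 = -1.0  # from the quadratic form above
d2f_dy2 = -3.0
print("local Nash conditions hold:", d2f_dx2 >= 0 and d2f_dy2 <= 0)  # False: non-Nash attractor

The local symplectic surgery procedure described in the abstract modifies such gradient dynamics using second-order information so that only local Nash equilibria remain attracting; that adjustment is not reproduced in this sketch.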
