Paper information - Gradient Descent-Ascent Provably Converges to Strict Local Minmax Equilibria with a Finite Timescale Separation

We study the role that a finite timescale separation parameter $\tau$ plays in gradient descent-ascent in two-player non-convex, non-concave zero-sum games, where the learning rate of player 1 is denoted by $\gamma_1$ and the learning rate of player 2 is defined to be $\gamma_2=\tau\gamma_1$. Existing work analyzing the role of timescale separation in gradient descent-ascent has primarily focused on two edge cases: the players sharing a learning rate ($\tau =1$), and the maximizing player approximately converging between each update of the minimizing player ($\tau \rightarrow \infty$). For the parameter choice $\tau=1$, it is known that the learning dynamics are not guaranteed to converge to game-theoretically meaningful equilibria in general. In contrast, Jin et al. (2020) showed that the stable critical points of gradient descent-ascent coincide with the set of strict local minmax equilibria as $\tau\rightarrow\infty$. In this work, we bridge the gap between past work by showing that there exists a finite timescale separation parameter $\tau^{\ast}$ such that $x^{\ast}$ is a stable critical point of gradient descent-ascent for all $\tau \in (\tau^{\ast}, \infty)$ if and only if it is a strict local minmax equilibrium. Moreover, we provide an explicit construction for computing $\tau^{\ast}$, along with corresponding convergence rates and results under deterministic and stochastic gradient feedback. The convergence results we present are complemented by a non-convergence result: given a critical point $x^{\ast}$ that is not a strict local minmax equilibrium, there exists a finite timescale separation $\tau_0$ such that $x^{\ast}$ is unstable for all $\tau\in (\tau_0, \infty)$. Finally, we empirically demonstrate on the CIFAR-10 and CelebA datasets the significant impact that timescale separation has on training performance.
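To make the role of $\tau$ concrete, here is a minimal sketch on a toy quadratic game that is not taken from the paper: $f(x,y) = -\tfrac{1}{2}x^2 + 2xy - \tfrac{1}{2}y^2$, with player 1 minimizing over $x$ and player 2 maximizing over $y$. The game, the function names, and the step size are all assumptions chosen for illustration. At the origin, $D_y^2 f = -1 < 0$ and the Schur complement $D_x^2 f - D_{xy} f (D_y^2 f)^{-1} D_{yx} f = 3 > 0$, so the origin is a strict local minmax equilibrium even though $f$ is concave in $x$. For this particular game, the trace and determinant of the continuous-time Jacobian show that the origin is stable exactly when $\tau > 1$; the sketch does not implement the paper's general construction of $\tau^{\ast}$.

```python
import numpy as np

# Hypothetical toy game (an assumption, not from the paper):
#   f(x, y) = -0.5*x**2 + 2*x*y - 0.5*y**2
# Player 1 minimizes over x, player 2 maximizes over y.
F_XX, F_XY, F_YY = -1.0, 2.0, -1.0


def gda_jacobian(tau):
    """Jacobian at the origin of the continuous-time tau-GDA vector field
    (x_dot, y_dot) = (-df/dx, tau * df/dy)."""
    return np.array([[-F_XX, -F_XY],
                     [tau * F_XY, tau * F_YY]])


def is_stable(tau):
    """True if every eigenvalue of the Jacobian has negative real part."""
    eigenvalues = np.linalg.eigvals(gda_jacobian(tau))
    return bool(np.max(eigenvalues.real) < 0)


def run_gda(tau, gamma1=0.01, steps=5000, start=(1.0, 1.0)):
    """Run discrete-time tau-GDA with gamma2 = tau * gamma1 and
    return the distance of the final iterate from the origin."""
    x, y = start
    for _ in range(steps):
        grad_x = F_XX * x + F_XY * y   # df/dx
        grad_y = F_XY * x + F_YY * y   # df/dy
        # Simultaneous update: descent for player 1, ascent for player 2.
        x, y = x - gamma1 * grad_x, y + tau * gamma1 * grad_y
    return float(np.hypot(x, y))


if __name__ == "__main__":
    for tau in (0.5, 4.0):
        print(tau, is_stable(tau), run_gda(tau))
```

With the small step size assumed here, the iterates spiral away from the origin for $\tau = 0.5$ and contract toward it for $\tau = 4$, matching the eigenvalue test. Larger step sizes can break this agreement between the continuous-time test and the discrete iteration.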

Tanner Fiez | Lillian Ratliff

[1] T. Başar, et al. Dynamic Noncooperative Game Theory, 1982.

[2] Thomas Hofmann, et al. Local Saddle Point Optimization: A Curvature Exploitation Approach, 2018, AISTATS.

[3] Léon Bottou, et al. Wasserstein Generative Adversarial Networks, 2017, ICML.

[4] Lillian J. Ratliff, et al. Convergence of Learning Dynamics in Stackelberg Games, 2019, ArXiv.

[5] Jason D. Lee, et al. Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods, 2019, NeurIPS.

[6] Roy M. Howard, et al. Linear System Theory, 1992.

[7] Eyad H. Abed, et al. Generalized Stability of Linear Singularly Perturbed Systems Including Calculation of Maximal Parameter Range, 1990.

[8] Constantinos Daskalakis, et al. The complexity of constrained min-max optimization, 2020, STOC.

[9] Meisam Razaviyayn, et al. Efficient Search of First-Order Nash Equilibria in Nonconvex-Concave Smooth Min-Max Problems, 2021, SIAM J. Optim.

[10] Lahcen Saydy. New stability/performance results for singularly perturbed systems, 1996, Autom.

[11] Aleksander Madry, et al. Towards Deep Learning Models Resistant to Adversarial Attacks, 2017, ICLR.

[12] Michael I. Jordan, et al. On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems, 2019, ICML.

[13] Willy Govaerts, et al. Numerical methods for bifurcations of dynamical equilibria, 1987.

[14] Shimon Whiteson, et al. Learning with Opponent-Learning Awareness, 2017, AAMAS.

[15] C. Tretter. Spectral Theory Of Block Operator Matrices And Applications, 2008.

[16] S. Shankar Sastry, et al. On Finding Local Nash Equilibria (and Only Local Nash Equilibria) in Zero-Sum Games, 2019, arXiv:1901.00838.

[17] Sebastian Nowozin, et al. Stabilizing Training of Generative Adversarial Networks through Regularization, 2017, NIPS.

[18] João Pedro Hespanha, et al. Linear Systems Theory, 2009.

[19] Ashish Cherukuri, et al. Saddle-Point Dynamics: Conditions for Asymptotic Stability of Saddle Points, 2015, SIAM J. Control. Optim.

[20] David Duvenaud, et al. Optimizing Millions of Hyperparameters by Implicit Differentiation, 2019, AISTATS.

[21] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.

[22] Sebastian Nowozin, et al. Which Training Methods for GANs do actually Converge?, 2018, ICML.

[23] F. Takens, et al. Preliminaries of Dynamical Systems Theory, 2010.

[24] Michael I. Jordan, et al. What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization?, 2019, ICML.

[25] S. Shankar Sastry, et al. Characterization and computation of local Nash equilibria in continuous games, 2013, 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[26] Asuman Ozdaglar, et al. Do GANs always have Nash equilibria?, 2020, ICML.

[27] Thore Graepel, et al. Differentiable Game Mechanics, 2019, J. Mach. Learn. Res.

[28] S. Shankar Sastry, et al. On the Characterization of Local Nash Equilibria in Continuous Games, 2014, IEEE Transactions on Automatic Control.

[29] J. Zico Kolter, et al. Gradient descent GAN optimization is locally stable, 2017, NIPS.

[30] Eyad H. Abed, et al. Guardian maps and the generalized stability of parametrized families of matrices and polynomials, 1990, Math. Control. Signals Syst.

[31] I. Argyros. A generalization of Ostrowski's theorem on fixed points, 1999.

[32] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.

[33] Francis J. Doyle, et al. Nonlinear systems theory, 1997.

[34] Yongxin Chen, et al. Hybrid Block Successive Approximation for One-Sided Non-Convex Min-Max Problems: Algorithms and Applications, 2019, IEEE Transactions on Signal Processing.

[35] Victor R. Lesser, et al. Multi-Agent Learning with Policy Prediction, 2010, AAAI.

[36] Michael I. Jordan, et al. Near-Optimal Algorithms for Minimax Optimization, 2020, COLT.

[37] Roger B. Grosse, et al. Self-Tuning Networks: Bilevel Optimization of Hyperparameters using Structured Best-Response Functions, 2019, ICLR.

[38] Mingrui Liu, et al. Non-Convex Min-Max Optimization: Provable Algorithms and Applications in Machine Learning, 2018, ArXiv.

[39] Gauthier Gidel, et al. A Variational Inequality Perspective on Generative Adversarial Networks, 2018, ICLR.

[40] Georgios Piliouras, et al. Game dynamics as the meaning of a game, 2019, SECO.

[41] David Pfau, et al. Unrolled Generative Adversarial Networks, 2016, ICLR.

[42] S. Shankar Sastry, et al. On Gradient-Based Learning in Continuous Games, 2018, SIAM J. Math. Data Sci.

[43] Aravind Rajeswaran, et al. A Game Theoretic Framework for Model Based Reinforcement Learning, 2020, ICML.

[44] Chuan-Sheng Foo, et al. Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile, 2018, ICLR.

[45] P. Lancaster, et al. The theory of matrices: with applications, 1985.

[46] Hassan K. Khalil, et al. Singular perturbation methods in control: analysis and design, 1986.

[47] Thore Graepel, et al. The Mechanics of n-Player Differentiable Games, 2018, ICML.

[48] S. Sastry, et al. Jump behavior of circuits and systems, 1981, 1981 20th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes.

[49] Pascal Vincent, et al. A Closer Look at the Optimization Landscapes of Generative Adversarial Networks, 2019, ICLR.

[50] Lillian J. Ratliff, et al. Convergence Analysis of Gradient-Based Learning in Continuous Games, 2019, UAI.

[51] Sameer Kamal, et al. On the Convergence, Lock-In Probability, and Sample Complexity of Stochastic Approximation, 2010, SIAM J. Control. Optim.

[52] Ioannis Mitliagkas, et al. Negative Momentum for Improved Game Dynamics, 2018, AISTATS.

[53] John C. Duchi, et al. Certifying Some Distributional Robustness with Principled Adversarial Training, 2017, ICLR.

[54] Lillian Ratliff, et al. Local Nash Equilibria are Isolated, Strict Local Nash Equilibria in ‘Almost All’ Zero-Sum Continuous Games, 2019, 2019 IEEE 58th Conference on Decision and Control (CDC).

[55] P. Olver. Nonlinear Systems, 2013.

[56] S. Shankar Sastry, et al. Genericity and structural stability of non-degenerate differential Nash equilibria, 2014, 2014 American Control Conference.

[57] R. G. Casten, et al. Basic Concepts Underlying Singular Perturbation Techniques, 1972.

[58] Tanner Fiez, et al. Implicit Learning Dynamics in Stackelberg Games: Equilibria Characterization, Convergence Analysis, and Empirical Study, 2020, ICML.

[59] Constantinos Daskalakis, et al. Training GANs with Optimism, 2017, ICLR.

[60] Tamer Basar, et al. Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms, 2019, Handbook of Reinforcement Learning and Control.

[61] Constantinos Daskalakis, et al. The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization, 2018, NeurIPS.

[62] Charles R. Johnson, et al. Matrix analysis, 1985, Statistical Inference for Engineers and Data Scientists.

[63] Shimon Whiteson, et al. Stable Opponent Shaping in Differentiable Games, 2018, ICLR.

[64] D. Fudenberg, et al. The Theory of Learning in Games, 1998.

[65] James M. Ortega, et al. Iterative solution of nonlinear equations in several variables, 2014, Computer science and applied mathematics.

[66] Sebastian Nowozin, et al. The Numerics of GANs, 2017, NIPS.

[67] M. Benaïm. A Dynamical System Approach to Stochastic Approximations, 1996.

[68] D. Mustafa, et al. Generalized integral controllability, 1994, Proceedings of 1994 33rd IEEE Conference on Decision and Control.

[69] Guodong Zhang, et al. On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach, 2019, ICLR.

[70] Sepp Hochreiter, et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium, 2017, NIPS.

[71] John M Alongi, et al. Recurrence and Topology, 2007.