On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems

We consider nonconvex-concave minimax problems, $\min_{\mathbf{x}} \max_{\mathbf{y} \in \mathcal{Y}} f(\mathbf{x}, \mathbf{y})$, where $f$ is nonconvex in $\mathbf{x}$ but concave in $\mathbf{y}$ and $\mathcal{Y}$ is a convex and bounded set. A popular algorithm for solving this problem is gradient descent ascent (GDA), which is widely used in machine learning, control theory, and economics. Despite extensive convergence results in the convex-concave setting, GDA with equal stepsizes can converge to limit cycles or even diverge in general. In this paper, we present complexity results for two-time-scale GDA on nonconvex-concave minimax problems, showing that the algorithm efficiently finds a stationary point of the function $\Phi(\cdot) := \max_{\mathbf{y} \in \mathcal{Y}} f(\cdot, \mathbf{y})$. To the best of our knowledge, this is the first nonasymptotic analysis of two-time-scale GDA in this setting, shedding light on its superior practical performance in training generative adversarial networks (GANs) and other real applications.
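The update studied here is plain GDA run on two time scales: a gradient descent step on $\mathbf{x}$ with stepsize $\eta_{\mathbf{x}}$ and a projected gradient ascent step on $\mathbf{y}$ with a much larger stepsize $\eta_{\mathbf{y}} \gg \eta_{\mathbf{x}}$, so that $\mathbf{y}$ tracks an approximate best response to the slowly moving $\mathbf{x}$. The sketch below illustrates this on a toy nonconvex-concave problem; the objective $f(x, y) = y \sin(x)$, the stepsizes, and the iteration count are illustrative assumptions, not the paper's experiments.

```python
import numpy as np

# A minimal sketch of two-time-scale GDA, assuming the toy objective
#   f(x, y) = y * sin(x),  with  Y = [-1, 1],
# which is nonconvex in x and linear (hence concave) in y, so that
#   Phi(x) = max_{y in Y} f(x, y) = |sin(x)|.

def grad_x(x, y):
    # Partial gradient of f(x, y) = y * sin(x) with respect to x.
    return y * np.cos(x)

def grad_y(x, y):
    # Partial gradient with respect to y (independent of y here).
    return np.sin(x)

def project_Y(y, lo=-1.0, hi=1.0):
    # Euclidean projection onto the bounded convex set Y = [lo, hi].
    return np.clip(y, lo, hi)

def two_time_scale_gda(x0, y0, eta_x=1e-3, eta_y=1e-1, iters=20000):
    # "Two time scales" means the ascent stepsize eta_y is much larger
    # than the descent stepsize eta_x, so y approximately maximizes
    # f(x, .) before x moves appreciably.
    x, y = x0, y0
    for _ in range(iters):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x = x - eta_x * gx             # slow descent step on x
        y = project_Y(y + eta_y * gy)  # fast projected ascent step on y
    return x, y

x, y = two_time_scale_gda(x0=1.0, y0=0.0)
print(f"x = {x:.4f}, y = {y:.4f}, Phi(x) ~ {abs(np.sin(x)):.4f}")
```

On this toy problem the iterates settle near $x \approx 0$, a stationary point of $\Phi(x) = |\sin(x)|$, whereas running both updates with the same stepsize makes the coupled dynamics far more prone to cycling.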
