Near-Optimal Algorithms for Minimax Optimization

This paper resolves a longstanding open question pertaining to the design of near-optimal first-order algorithms for smooth and strongly-convex-strongly-concave minimax problems. Current state-of-the-art first-order algorithms find an approximate Nash equilibrium using $\tilde{O}(\kappa_{\mathbf x}+\kappa_{\mathbf y})$ or $\tilde{O}(\min\{\kappa_{\mathbf x}\sqrt{\kappa_{\mathbf y}}, \sqrt{\kappa_{\mathbf x}}\kappa_{\mathbf y}\})$ gradient evaluations, where $\kappa_{\mathbf x}$ and $\kappa_{\mathbf y}$ are the condition numbers for the strong-convexity and strong-concavity assumptions. A gap still remains between these results and the best existing lower bound $\tilde{\Omega}(\sqrt{\kappa_{\mathbf x}\kappa_{\mathbf y}})$. This paper presents the first algorithm with $\tilde{O}(\sqrt{\kappa_{\mathbf x}\kappa_{\mathbf y}})$ gradient complexity, matching the lower bound up to logarithmic factors. Our algorithm is designed based on an accelerated proximal point method and an accelerated solver for minimax proximal steps. It can be easily extended to the settings of strongly-convex-concave, convex-concave, nonconvex-strongly-concave, and nonconvex-concave functions. This paper also presents algorithms that match or outperform all existing methods in these settings in terms of gradient complexity, up to logarithmic factors.
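To make the two-level structure described above concrete, the following is a minimal illustrative sketch in Python (assuming NumPy), not the paper's actual accelerated scheme: an outer proximal point loop on a toy quadratic strongly-convex-strongly-concave objective, where each regularized saddle subproblem is solved approximately by inner extragradient steps. The objective, the parameters mu_x, mu_y, beta, eta, n_inner, and the helper names are all illustrative assumptions; the paper's method additionally accelerates both the outer loop and the inner solver to attain the $\tilde{O}(\sqrt{\kappa_{\mathbf x}\kappa_{\mathbf y}})$ gradient complexity.

```python
# Illustrative sketch only (not the paper's algorithm): an outer proximal-point
# loop for the quadratic strongly-convex-strongly-concave saddle problem
#   f(x, y) = (mu_x/2)||x||^2 + x^T A y - (mu_y/2)||y||^2,
# where each proximal subproblem
#   f(x, y) + (beta/2)||x - xc||^2 - (beta/2)||y - yc||^2
# is better conditioned and is solved approximately by extragradient steps.
import numpy as np

rng = np.random.default_rng(0)
d, mu_x, mu_y = 20, 0.05, 0.02            # dimension and strong convexity/concavity moduli
A = rng.standard_normal((d, d)) / np.sqrt(d)

def grad(x, y):
    """Gradients of f with respect to x and y."""
    gx = mu_x * x + A @ y
    gy = A.T @ x - mu_y * y
    return gx, gy

def solve_prox_step(xc, yc, beta, eta, n_inner):
    """Approximately solve the regularized saddle subproblem with extragradient."""
    x, y = xc.copy(), yc.copy()
    for _ in range(n_inner):
        gx, gy = grad(x, y)
        # extrapolation step on the regularized subproblem
        xh = x - eta * (gx + beta * (x - xc))
        yh = y + eta * (gy - beta * (y - yc))
        # update step using gradients at the extrapolated point
        gxh, gyh = grad(xh, yh)
        x = x - eta * (gxh + beta * (xh - xc))
        y = y + eta * (gyh - beta * (yh - yc))
    return x, y

# Outer loop: a plain (non-accelerated) proximal-point iteration, for illustration.
x, y = rng.standard_normal(d), rng.standard_normal(d)
beta, eta, n_inner = 1.0, 0.2, 50
for _ in range(100):
    x, y = solve_prox_step(x, y, beta, eta, n_inner)

gx, gy = grad(x, y)
print("gradient norm at final iterate:", np.hypot(np.linalg.norm(gx), np.linalg.norm(gy)))
```

The sketch converges to the unique equilibrium at the origin; the design point it illustrates is that the regularization parameter beta trades off the conditioning of each subproblem against how many outer proximal steps are needed.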
