Near-Optimal Algorithms for Minimax Optimization

This paper resolves a longstanding open question concerning the design of near-optimal first-order algorithms for smooth, strongly-convex-strongly-concave minimax problems. Current state-of-the-art first-order algorithms find an approximate Nash equilibrium using $\tilde{O}(\kappa_{\mathbf x}+\kappa_{\mathbf y})$ or $\tilde{O}(\min\{\kappa_{\mathbf x}\sqrt{\kappa_{\mathbf y}}, \sqrt{\kappa_{\mathbf x}}\kappa_{\mathbf y}\})$ gradient evaluations, where $\kappa_{\mathbf x}$ and $\kappa_{\mathbf y}$ are the condition numbers for strong convexity and strong concavity, respectively. A gap remains between these results and the best existing lower bound of $\tilde{\Omega}(\sqrt{\kappa_{\mathbf x}\kappa_{\mathbf y}})$. This paper presents the first algorithm with $\tilde{O}(\sqrt{\kappa_{\mathbf x}\kappa_{\mathbf y}})$ gradient complexity, matching the lower bound up to logarithmic factors. The algorithm is built on an accelerated proximal point method combined with an accelerated solver for the minimax proximal steps, and it extends readily to the strongly-convex-concave, convex-concave, nonconvex-strongly-concave, and nonconvex-concave settings. In each of these settings, the paper presents algorithms that match or outperform all existing methods in gradient complexity, up to logarithmic factors.
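To illustrate the inexact-proximal-point idea the abstract describes (this is a minimal sketch, not the paper's actual algorithm): each outer iteration adds a strongly-convex-strongly-concave regularization around the current iterate and approximately solves the resulting saddle subproblem with an inner first-order solver, here a few extragradient steps. The quadratic instance, the penalty parameter `beta`, and the step size `eta` are illustrative assumptions.

```python
import numpy as np

# Illustrative instance: f(x, y) = (mu_x/2)||x||^2 + x^T A y - (mu_y/2)||y||^2,
# which is mu_x-strongly convex in x and mu_y-strongly concave in y.
rng = np.random.default_rng(0)
d = 5
A = rng.standard_normal((d, d))
mu_x, mu_y = 1.0, 1.0

def grad(x, y):
    """Monotone saddle operator: (grad_x f, grad_y (-f))."""
    gx = mu_x * x + A @ y
    gy = -(A.T @ x) + mu_y * y
    return gx, gy

def prox_point_minimax(x, y, beta=1.0, outer=200, inner=20, eta=0.1):
    """Inexact proximal point: each outer step approximately solves the
    beta-regularized saddle subproblem via extragradient iterations."""
    for _ in range(outer):
        xc, yc = x.copy(), y.copy()          # proximal centers
        u, v = x.copy(), y.copy()
        for _ in range(inner):
            gx, gy = grad(u, v)
            # extrapolation half-step on the regularized operator
            uh = u - eta * (gx + beta * (u - xc))
            vh = v - eta * (gy + beta * (v - yc))
            gxh, gyh = grad(uh, vh)
            # update using the gradient at the extrapolated point
            u = u - eta * (gxh + beta * (uh - xc))
            v = v - eta * (gyh + beta * (vh - yc))
        x, y = u, v
    return x, y

x0 = rng.standard_normal(d)
y0 = rng.standard_normal(d)
xs, ys = prox_point_minimax(x0, y0)
# The unique saddle point of this quadratic game is (0, 0),
# so ||xs|| and ||ys|| should be driven toward zero.
```

The paper's acceleration comes from running an accelerated (momentum-based) outer proximal point scheme and an accelerated inner solver in place of the plain loops above; the sketch only shows the two-level structure.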
