Efficient Algorithms for Smooth Minimax Optimization

This paper studies first-order methods for solving smooth minimax optimization problems $\min_x \max_y g(x,y)$, where $g(\cdot,\cdot)$ is smooth and $g(x,\cdot)$ is concave for each $x$. In terms of $g(\cdot,y)$, we consider two settings, strongly convex and nonconvex, and improve upon the best known rates in both. When $g(\cdot, y)$ is strongly convex for every $y$, we propose a new algorithm combining Mirror-Prox and Nesterov's AGD, and show that it converges to the global optimum at a rate of $\tilde{O}(1/k^2)$ after $k$ iterations, improving over the current state-of-the-art rate of $O(1/k)$. Combining this result with an inexact proximal point method, we obtain a rate of $\tilde{O}(1/k^{1/3})$ for finding stationary points in the nonconvex setting, where $g(\cdot, y)$ can be nonconvex; this improves over the current best-known rate of $O(1/k^{1/5})$. Finally, we instantiate our result for finite nonconvex minimax problems, i.e., $\min_x \max_{1\leq i\leq m} f_i(x)$ with nonconvex $f_i(\cdot)$, and obtain a rate of $O(m(\log m)^{3/2}/k^{1/3})$ in terms of the total number of gradient evaluations $k$ for finding a stationary point.
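
As an illustration of the first building block, the following is a minimal sketch of a Euclidean Mirror-Prox step (i.e., the extragradient method) for $\min_x \max_y g(x,y)$. This is the standard primitive only, not the combined Mirror-Prox/AGD algorithm proposed above; the gradient oracles grad_x, grad_y and the fixed step size eta are illustrative assumptions.

    def extragradient_step(x, y, grad_x, grad_y, eta):
        """One Euclidean Mirror-Prox (extragradient) step.

        grad_x(x, y), grad_y(x, y): oracles for the partial gradients of g;
        eta: step size (eta <= 1/L for L-smooth g recovers the classical
        O(1/k) ergodic rate in the convex-concave case).
        """
        # Extrapolation: a plain gradient descent/ascent half-step.
        x_mid = x - eta * grad_x(x, y)
        y_mid = y + eta * grad_y(x, y)
        # Update: step from (x, y) using the gradients at the midpoint.
        x_new = x - eta * grad_x(x_mid, y_mid)
        y_new = y + eta * grad_y(x_mid, y_mid)
        return x_new, y_new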

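For the finite minimax instantiation, note that $\max_{1\leq i\leq m} f_i(x) = \max_{y\in\Delta_m} \sum_{i=1}^m y_i f_i(x)$, a concave (indeed linear) maximization over the probability simplex, so the general machinery applies. The sketch below uses log-sum-exp (softmax) smoothing with parameter $\lambda$, an illustrative stand-in rather than the paper's exact construction, to show where a $\log m$ factor naturally enters: the surrogate $h_\lambda(x) = \frac{1}{\lambda}\log\sum_i e^{\lambda f_i(x)}$ satisfies $\max_i f_i(x) \leq h_\lambda(x) \leq \max_i f_i(x) + (\log m)/\lambda$.

    import numpy as np

    def smoothed_max_and_grad(fs, grads, x, lam):
        """Log-sum-exp smoothing of max_i f_i(x).

        fs, grads: lists of the m functions f_i and their gradients;
        lam: smoothing parameter (lam ~ (log m)/eps yields an
        eps-accurate smooth surrogate of the max).
        """
        vals = np.array([f(x) for f in fs])
        shifted = lam * (vals - vals.max())   # shift for numerical stability
        w = np.exp(shifted)
        w /= w.sum()                          # softmax weights over the m pieces
        h = vals.max() + np.log(np.exp(shifted).sum()) / lam
        grad = sum(wi * g(x) for wi, g in zip(w, grads))
        return h, grad

Running a smooth first-order method on such a surrogate inherits a polylogarithmic dependence on $m$, of the kind appearing in the $O(m(\log m)^{3/2}/k^{1/3})$ rate above.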