Efficient Algorithms for Smooth Minimax Optimization

This paper studies first-order methods for solving smooth minimax optimization problems $\min_x \max_y g(x,y)$, where $g(\cdot,\cdot)$ is smooth and $g(x,\cdot)$ is concave for each $x$. In terms of $g(\cdot,y)$, we consider two settings, strongly convex and nonconvex, and improve upon the best known rates in both. When $g(\cdot, y)$ is strongly convex for every $y$, we propose a new algorithm combining Mirror-Prox and Nesterov's AGD, and show that it converges to the global optimum at a rate of $\tilde{O}(1/k^2)$ after $k$ iterations, improving over the current state-of-the-art rate of $O(1/k)$. Combining this result with an inexact proximal point method, we obtain a rate of $\tilde{O}(1/k^{1/3})$ for finding stationary points in the nonconvex setting, where $g(\cdot, y)$ can be nonconvex; this improves over the current best-known rate of $O(1/k^{1/5})$. Finally, we instantiate our result for finite nonconvex minimax problems, i.e., $\min_x \max_{1\leq i\leq m} f_i(x)$ with nonconvex $f_i(\cdot)$, and obtain a rate of $O(m(\log m)^{3/2}/k^{1/3})$ in terms of the total number of gradient evaluations $k$ for finding a stationary point.
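
As an illustration of the first building block, the following is a minimal sketch of a Euclidean Mirror-Prox step (i.e., the extragradient method) for $\min_x \max_y g(x,y)$. This is the standard primitive only, not the combined Mirror-Prox/AGD algorithm proposed above; the gradient oracles grad_x, grad_y and the fixed step size eta are illustrative assumptions.

    def extragradient_step(x, y, grad_x, grad_y, eta):
        """One Euclidean Mirror-Prox (extragradient) step.

        grad_x(x, y), grad_y(x, y): oracles for the partial gradients of g;
        eta: step size (eta <= 1/L for L-smooth g recovers the classical
        O(1/k) ergodic rate in the convex-concave case).
        """
        # Extrapolation: a plain gradient descent/ascent half-step.
        x_mid = x - eta * grad_x(x, y)
        y_mid = y + eta * grad_y(x, y)
        # Update: step from (x, y) using the gradients at the midpoint.
        x_new = x - eta * grad_x(x_mid, y_mid)
        y_new = y + eta * grad_y(x_mid, y_mid)
        return x_new, y_new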

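For the finite minimax instantiation, note that $\max_{1\leq i\leq m} f_i(x) = \max_{y\in\Delta_m} \sum_{i=1}^m y_i f_i(x)$, a concave (indeed linear) maximization over the probability simplex, so the general machinery applies. The sketch below uses log-sum-exp (softmax) smoothing with parameter $\lambda$, an illustrative stand-in rather than the paper's exact construction, to show where a $\log m$ factor naturally enters: the surrogate $h_\lambda(x) = \frac{1}{\lambda}\log\sum_i e^{\lambda f_i(x)}$ satisfies $\max_i f_i(x) \leq h_\lambda(x) \leq \max_i f_i(x) + (\log m)/\lambda$.

    import numpy as np

    def smoothed_max_and_grad(fs, grads, x, lam):
        """Log-sum-exp smoothing of max_i f_i(x).

        fs, grads: lists of the m functions f_i and their gradients;
        lam: smoothing parameter (lam ~ (log m)/eps yields an
        eps-accurate smooth surrogate of the max).
        """
        vals = np.array([f(x) for f in fs])
        shifted = lam * (vals - vals.max())   # shift for numerical stability
        w = np.exp(shifted)
        w /= w.sum()                          # softmax weights over the m pieces
        h = vals.max() + np.log(np.exp(shifted).sum()) / lam
        grad = sum(wi * g(x) for wi, g in zip(w, grads))
        return h, grad

Running a smooth first-order method on such a surrogate inherits a polylogarithmic dependence on $m$, of the kind appearing in the $O(m(\log m)^{3/2}/k^{1/3})$ rate above.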