Sharp Analysis of Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization

The epoch gradient descent method (Epoch-GD) proposed by Hazan and Kale (2011) was a breakthrough for stochastic strongly convex minimization, achieving the optimal O(1/T) convergence rate for the objective gap with T iterative updates. However, its extension to stochastic min-max problems that are strongly convex in the primal variable and strongly concave in the dual variable has remained open, and it has been unclear whether a fast O(1/T) rate for the duality gap is achievable in this setting. Although some recent studies have proposed stochastic algorithms with fast convergence rates for min-max problems, they require additional assumptions on the problem, such as smoothness or a bilinear structure. In this paper, we bridge this gap by providing a sharp analysis of an epoch-wise stochastic gradient descent ascent method (Epoch-GDA) for solving strongly-convex-strongly-concave (SCSC) min-max problems, without imposing any additional assumptions on smoothness or the structure of the objective. To the best of our knowledge, our result is the first to show that Epoch-GDA achieves the fast O(1/T) rate for the duality gap of general SCSC min-max problems. We emphasize that generalizing Epoch-GD for strongly convex minimization to Epoch-GDA for SCSC min-max problems is non-trivial and requires a novel technical analysis. Moreover, the key lemma can also be used to prove the convergence of Epoch-GDA for weakly-convex-strongly-concave min-max problems, yielding the best-known complexity in that setting as well, again without smoothness or other structural conditions.
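For concreteness, recall that the duality gap at a point (x, y) is max_{y'} f(x, y') - min_{x'} f(x', y). Below is a minimal sketch of the epoch-wise scheme that Epoch-GDA instantiates, assuming the standard Epoch-GD template: a fixed step size within each epoch, with the next epoch restarted from the averaged iterates while the step size is halved and the epoch length doubled. The function epoch_gda, its default parameters, and the toy SCSC objective are illustrative assumptions for this sketch, not the paper's exact pseudocode or step-size constants.

```python
import numpy as np

def epoch_gda(grad_x, grad_y, x0, y0, eta0=0.5, T0=8, num_epochs=10,
              proj_x=None, proj_y=None):
    """Epoch-wise stochastic gradient descent ascent (illustrative sketch).

    grad_x, grad_y: stochastic gradient oracles w.r.t. the primal variable x
    (minimized) and the dual variable y (maximized).
    """
    proj_x = proj_x or (lambda v: v)   # identity projection if the domain is unconstrained
    proj_y = proj_y or (lambda v: v)
    x, y = np.asarray(x0, dtype=float), np.asarray(y0, dtype=float)
    eta, T = eta0, T0
    for _ in range(num_epochs):
        xs, ys = [], []
        for _ in range(T):
            gx, gy = grad_x(x, y), grad_y(x, y)   # one stochastic gradient pair
            x = proj_x(x - eta * gx)              # descent step on x
            y = proj_y(y + eta * gy)              # ascent step on y
            xs.append(x.copy())
            ys.append(y.copy())
        # Restart the next epoch from the averaged iterates,
        # halving the step size and doubling the epoch length.
        x, y = np.mean(xs, axis=0), np.mean(ys, axis=0)
        eta, T = eta / 2.0, T * 2
    return x, y

# Toy SCSC instance: f(x, y) = 0.5*||x||^2 + x'y - 0.5*||y||^2 with saddle point (0, 0);
# additive Gaussian noise stands in for the stochastic gradient oracle.
gx = lambda x, y: x + y + 0.05 * np.random.randn(*x.shape)
gy = lambda x, y: x - y + 0.05 * np.random.randn(*y.shape)
x_hat, y_hat = epoch_gda(gx, gy, x0=np.ones(3), y0=-np.ones(3))
print(x_hat, y_hat)  # both should be close to the zero vector
```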

[1] Mingyi Hong, et al. Decomposing Linearly Constrained Nonconvex Problems by a Proximal Primal Dual Approach: Algorithms, Convergence, and Applications, 2016, arXiv.

[2] Mingyi Hong, et al. Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solutions for Nonconvex Distributed Optimization, 2018, arXiv.

[3] Jason D. Lee, et al. Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods, 2019, NeurIPS.

[4] Guanghui Lan, et al. Randomized First-Order Methods for Saddle Point Optimization, 2014, arXiv:1409.8625.

[5] Francis R. Bach, et al. Stochastic Variance Reduction Methods for Saddle-Point Problems, 2016, NIPS.

[6] Lin Xiao, et al. Exploiting Strong Convexity from Data with Primal-Dual First-Order Algorithms, 2017, ICML.

[7] Mingrui Liu, et al. Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate, 2018, ICML.

[8] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.

[9] Yunmei Chen, et al. Optimal Primal-Dual Methods for a Class of Saddle Point Problems, 2013, SIAM J. Optim.

[10] Siwei Lyu, et al. Stochastic Online AUC Maximization, 2016, NIPS.

[11] John C. Duchi, et al. Stochastic Gradient Methods for Distributionally Robust Optimization with f-divergences, 2016, NIPS.

[12] Martin J. Wainwright, et al. Information-theoretic lower bounds on the oracle complexity of convex optimization, 2009, NIPS.

[13] Michael I. Jordan, et al. On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems, 2019, ICML.

[14] N. S. Aybat, et al. A Primal-Dual Algorithm with Line Search for General Convex-Concave Saddle Point Problems, 2020, SIAM J. Optim.

[15] A. Juditsky, et al. Solving variational inequalities with Stochastic Mirror-Prox algorithm, 2008, arXiv:0809.0815.

[16] John C. Duchi, et al. Variance-based Regularization with Convex Objectives, 2016, NIPS.

[17] Mingyi Hong, et al. Perturbed proximal primal–dual algorithm for nonconvex nonsmooth optimization, 2019, Math. Program.

[18] Tianbao Yang, et al. Stochastic Primal-Dual Algorithms with Faster Convergence than O(1/√T) for Problems without Bilinear Structure, 2019, arXiv.

[19] Yuchen Zhang, et al. Stochastic Primal-Dual Coordinate Method for Regularized Empirical Risk Minimization, 2014, ICML.

[20] Mingrui Liu, et al. Stochastic AUC Maximization with Deep Neural Networks, 2019, ICLR.

[21] Shiqian Ma, et al. Stochastic Primal-Dual Method for Empirical Risk Minimization with O(1) Per-Iteration Complexity, 2018, NeurIPS.

[22] Le Thi Khanh Hien, et al. A primal-dual smoothing gap reduction framework for strongly convex-generally concave saddle point problems, 2017.

[23] Alexander Shapiro, et al. Validation analysis of mirror descent stochastic approximation method, 2012, Math. Program.

[24] Yongxin Chen, et al. Hybrid Block Successive Approximation for One-Sided Non-Convex Min-Max Problems: Algorithms and Applications, 2019, IEEE Transactions on Signal Processing.

[25] Wei Hu, et al. Linear Convergence of the Primal-Dual Gradient Method for Convex-Concave Saddle Point Problems without Strong Convexity, 2018, AISTATS.

[26] Haishan Ye, et al. Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems, 2020, NeurIPS.

[27] Renbo Zhao. Optimal Algorithms for Stochastic Three-Composite Convex-Concave Saddle Point Problems, 2019, arXiv:1903.01687.

[28] Elad Hazan, et al. An optimal algorithm for stochastic strongly-convex optimization, 2010, arXiv:1006.2425.

[29] L. Hien, et al. An Inexact Primal-Dual Smoothing Framework for Large-Scale Non-Bilinear Saddle Point Problems, 2017, J. Optim. Theory Appl.

[30] Yurii Nesterov, et al. Excessive Gap Technique in Nonsmooth Convex Minimization, 2005, SIAM J. Optim.

[31] Tony Jebara, et al. Frank-Wolfe Algorithms for Saddle Point Problems, 2016, AISTATS.

[32] Antonin Chambolle, et al. A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging, 2011, Journal of Mathematical Imaging and Vision.

[33] Alexander Shapiro, et al. Stochastic Approximation approach to Stochastic Programming, 2013.

[34] Léon Bottou, et al. Wasserstein Generative Adversarial Networks, 2017, ICML.

[35] N. S. Aybat, et al. A Primal-Dual Algorithm for General Convex-Concave Saddle Point Problems, 2018, arXiv:1803.01401.

[36] Bao-Gang Hu, et al. Learning with Average Top-k Loss, 2017, NIPS.

[37] Tianbao Yang, et al. An efficient primal dual prox method for non-smooth optimization, 2014, Machine Learning.

[38] Shai Shalev-Shwartz, et al. Stochastic dual coordinate ascent methods for regularized loss, 2012, J. Mach. Learn. Res.

[39] Nathan Srebro, et al. Lower Bounds for Non-Convex Stochastic Optimization, 2019, arXiv.

[40] Panayotis Mertikopoulos, et al. On the convergence of single-call stochastic extra-gradient methods, 2019, NeurIPS.

[41] Mingrui Liu, et al. Non-Convex Min-Max Optimization: Provable Algorithms and Applications in Machine Learning, 2018, arXiv.

[42] Tianbao Yang, et al. Doubly Stochastic Primal-Dual Coordinate Method for Regularized Empirical Risk Minimization with Factorized Data, 2015, arXiv.