Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization

We consider the smooth convex-concave bilinearly-coupled saddle-point problem min_x max_y F(x) + H(x, y) − G(y), where one has access to stochastic first-order oracles for F and G, as well as for the bilinear coupling function H. Building on the standard stochastic extragradient analysis for variational inequalities, we present a stochastic accelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragradient steps with Nesterov acceleration in general stochastic settings. The algorithm leverages scheduled restarting to attain a fine-grained nonasymptotic convergence rate that matches the known lower bounds of Ibrahim et al. [2020] and Zhang et al. [2021a] in their corresponding settings, plus an additional statistical error term for bounded stochastic noise that is optimal up to a constant prefactor. To the best of our knowledge, this is the first result to achieve such a precise characterization of optimality in saddle-point optimization.
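To make the setup concrete, the sketch below illustrates the family of methods the abstract builds on, applied to a toy instance min_x max_y F(x) + x^T B y − G(y) with quadratic F and G: plain stochastic extragradient with per-epoch iterate averaging and scheduled restarting. This is not the paper's AG-EG algorithm (in particular, it omits the Nesterov acceleration step), and the problem dimension, step size, and noise level are illustrative assumptions chosen only to make the example runnable.

```python
# Minimal sketch (assumptions noted above): stochastic extragradient with
# restarted iteration averaging on min_x max_y (mu_x/2)||x||^2 + x^T B y - (mu_y/2)||y||^2.
import numpy as np

rng = np.random.default_rng(0)
d, mu_x, mu_y, sigma = 5, 1.0, 1.0, 0.1   # dimension, strong convexity/concavity moduli, noise level (assumed)
B = rng.standard_normal((d, d))           # bilinear coupling matrix (assumed)

def stochastic_oracle(x, y):
    """Noisy first-order oracle for phi(x, y) = (mu_x/2)||x||^2 + x^T B y - (mu_y/2)||y||^2.
    Returns the descent direction in x and the negated ascent direction in y."""
    gx = mu_x * x + B @ y + sigma * rng.standard_normal(d)    # grad_x phi + bounded noise
    gy = mu_y * y - B.T @ x + sigma * rng.standard_normal(d)  # -grad_y phi + bounded noise
    return gx, gy

def extragradient_epoch(x, y, steps, eta):
    """One epoch of stochastic extragradient; returns the epoch's averaged iterate."""
    x_sum, y_sum = np.zeros(d), np.zeros(d)
    for _ in range(steps):
        gx, gy = stochastic_oracle(x, y)            # extrapolation (look-ahead) step
        x_half, y_half = x - eta * gx, y - eta * gy
        gx, gy = stochastic_oracle(x_half, y_half)  # update step using look-ahead gradients
        x, y = x - eta * gx, y - eta * gy
        x_sum += x
        y_sum += y
    return x_sum / steps, y_sum / steps

# Scheduled restarting: each epoch restarts from the previous epoch's averaged iterate.
x, y = np.ones(d), np.ones(d)
for epoch in range(6):
    x, y = extragradient_epoch(x, y, steps=200, eta=0.05)
    print(f"epoch {epoch}: distance to saddle point = {np.linalg.norm(np.concatenate([x, y])):.4f}")
```

Under strong monotonicity of the underlying operator, each restarted epoch contracts the distance to the saddle point (here the origin) down to a noise-dependent error floor, which is the qualitative behavior the restarted-averaging analyses in the references above make precise.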

[1] Kevin Tian, et al. Sharper Rates for Separable Minimax and Finite Sum Optimization via Primal-Dual Extragradient Methods, 2022, COLT.

[2] Niao He, et al. Lifted Primal-Dual Method for Bilinearly Coupled Smooth Minimax Optimization, 2022, AISTATS.

[3] Nicolas Le Roux, et al. On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging, 2021, AISTATS.

[4] Guanghui Lan, et al. Simple and optimal methods for stochastic variational inequalities, I: operator extrapolation, 2020, SIAM J. Optim.

[5] Peter Richtárik, et al. Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling, 2021, NeurIPS.

[6] Niao He, et al. The Complexity of Nonconvex-Strongly-Concave Minimax Optimization, 2021, UAI.

[7] Zhihua Zhang, et al. DIPPA: An improved Method for Bilinear Saddle Point Problems, 2021, arXiv.

[8] Kevin Tian, et al. Relative Lipschitzness in Extragradient Methods and a Direct Recipe for Acceleration, 2020, ITCS.

[9] Arthur Mensch, et al. Extra-gradient with player sampling for faster convergence in n-player games, 2020, ICML.

[10] Ioannis Mitliagkas, et al. Stochastic Hamiltonian Gradient Methods for Smooth Games, 2020, ICML.

[11] Yuanhao Wang, et al. Improved Algorithms for Convex-Concave Minimax Optimization, 2020, NeurIPS.

[12] J. Malick, et al. Explore Aggressively, Update Conservatively: Stochastic Extragradient Methods with Variable Stepsize Scaling, 2020, NeurIPS.

[13] Asuman Ozdaglar, et al. An Optimal Multistage Stochastic Gradient Method for Minimax Problems, 2020, 59th IEEE Conference on Decision and Control (CDC).

[14] Michael I. Jordan, et al. Near-Optimal Algorithms for Minimax Optimization, 2020, COLT.

[15] Ioannis Mitliagkas, et al. Accelerating Smooth Games by Manipulating Spectral Shapes, 2020, AISTATS.

[16] Ioannis Mitliagkas, et al. Linear Lower Bounds and Conditioning of Differentiable Games, 2019, ICML.

[17] Peter Richtárik, et al. Revisiting Stochastic Extragradient, 2019, AISTATS.

[18] Aryan Mokhtari, et al. A Unified Analysis of Extra-gradient and Optimistic Gradient Methods for Saddle Point Problems: Proximal Point Approach, 2019, AISTATS.

[19] Ioannis Mitliagkas, et al. A Tight and Unified Analysis of Gradient-Based Methods for a Whole Spectrum of Differentiable Games, 2020, AISTATS.

[20] Guanghui Lan, et al. First-order and Stochastic Optimization Methods for Machine Learning, 2020.

[21] Kun Yuan, et al. ODE Analysis of Stochastic Gradient Methods with Optimism and Anchoring for Minimax Problems and GANs, 2019, arXiv.

[22] Ioannis Mitliagkas, et al. Negative Momentum for Improved Game Dynamics, 2018, AISTATS.

[23] Chuan-Sheng Foo, et al. Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile, 2018, ICLR.

[24] Gauthier Gidel, et al. A Variational Inequality Perspective on Generative Adversarial Networks, 2018, ICLR.

[25] Tengyuan Liang, et al. Interaction Matters: A Note on Non-asymptotic Local Convergence of Generative Adversarial Networks, 2018, AISTATS.

[26] Weizhu Chen, et al. DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization, 2017, J. Mach. Learn. Res.

[27] Constantinos Daskalakis, et al. Training GANs with Optimism, 2017, ICLR.

[28] J. Zico Kolter, et al. Gradient descent GAN optimization is locally stable, 2017, NIPS.

[29] Sebastian Nowozin, et al. The Numerics of GANs, 2017, NIPS.

[30] Lin Xiao, et al. Exploiting Strong Convexity from Data with Primal-Dual First-Order Algorithms, 2017, ICML.

[31] Alfredo N. Iusem, et al. Extragradient Method with Variance Reduction for Stochastic Variational Inequalities, 2017, SIAM J. Optim.

[32] Lihong Li, et al. Stochastic Variance Reduction Methods for Policy Evaluation, 2017, ICML.

[33] Yunmei Chen, et al. Accelerated schemes for a class of variational inequalities, 2014, Mathematical Programming.

[34] Yuchen Zhang, et al. Stochastic Primal-Dual Coordinate Method for Regularized Empirical Risk Minimization, 2014, ICML.

[35] Alexander Shapiro, et al. Stochastic Approximation Approach to Stochastic Programming, 2013.

[36] A. Juditsky, et al. Solving variational inequalities with Stochastic Mirror-Prox algorithm, 2008, arXiv:0809.0815.

[37] Yurii Nesterov. Solving Strongly Monotone Variational and Quasi-Variational Inequalities, 2006.

[38] P. Tseng. On linear convergence of iterative methods for the variational inequality problem, 1995.

[39] Y. Nesterov. A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2), 1983.

[40] M. Sion. On general minimax theorems, 1958.