A Unified Analysis of Extra-gradient and Optimistic Gradient Methods for Saddle Point Problems: Proximal Point Approach

In this paper, we consider solving saddle point problems using two variants of the Gradient Descent-Ascent algorithm: the Extra-gradient (EG) and Optimistic Gradient Descent Ascent (OGDA) methods. We show that both algorithms admit a unified analysis as approximations of the classical proximal point method for solving saddle point problems. This viewpoint enables us to develop a new framework for analyzing EG and OGDA in the bilinear and strongly convex-strongly concave settings. Moreover, we use the proximal point approximation interpretation to generalize the results for OGDA to a wide range of parameters.
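As a rough illustration of the viewpoint described above, the following sketch compares plain gradient descent-ascent, the proximal point method, EG, and OGDA on a toy bilinear problem. The problem f(x, y) = x * y, the step size `eta = 0.1`, the starting point, the iteration count, and all function names are assumptions chosen for illustration; they are not taken from the paper. The update rules are the standard textbook forms of each method, written for the stacked variable z = (x, y).

```python
import numpy as np

# Assumed toy problem (not from the paper): min_x max_y f(x, y) = x * y.
# Its unique saddle point is (0, 0). With z = (x, y), the descent-ascent
# vector field is F(z) = (df/dx, -df/dy) = (y, -x) = M @ z.
M = np.array([[0.0, 1.0], [-1.0, 0.0]])


def F(z):
    return M @ z


def gda(z0, eta, steps):
    # Plain gradient descent-ascent: z_{k+1} = z_k - eta * F(z_k).
    z = z0.copy()
    for _ in range(steps):
        z = z - eta * F(z)
    return z


def proximal_point(z0, eta, steps):
    # Implicit update z_{k+1} = z_k - eta * F(z_{k+1}). Because F is linear
    # here, each step reduces to solving (I + eta * M) z_{k+1} = z_k.
    A = np.eye(2) + eta * M
    z = z0.copy()
    for _ in range(steps):
        z = np.linalg.solve(A, z)
    return z


def extra_gradient(z0, eta, steps):
    # EG: take a trial step to a midpoint, then step from z_k using the
    # vector field evaluated at that midpoint.
    z = z0.copy()
    for _ in range(steps):
        z_mid = z - eta * F(z)
        z = z - eta * F(z_mid)
    return z


def ogda(z0, eta, steps):
    # OGDA: z_{k+1} = z_k - 2 * eta * F(z_k) + eta * F(z_{k-1}),
    # initialised with z_{-1} = z_0 (an assumed convention).
    z_prev = z0.copy()
    z = z0.copy()
    for _ in range(steps):
        z_next = z - 2.0 * eta * F(z) + eta * F(z_prev)
        z_prev, z = z, z_next
    return z


z0 = np.array([1.0, 1.0])
eta, steps = 0.1, 500
methods = {
    "GDA": gda,
    "Proximal point": proximal_point,
    "EG": extra_gradient,
    "OGDA": ogda,
}
# Distance to the saddle point (the origin) after the final iteration.
final_dist = {name: float(np.linalg.norm(fn(z0, eta, steps)))
              for name, fn in methods.items()}
for name, dist in final_dist.items():
    print(f"{name:>15s}: distance to saddle point = {dist:.4f}")
```

On this toy instance, plain gradient descent-ascent moves away from the saddle point, while the proximal point method, EG, and OGDA all move toward it, which is consistent with reading EG and OGDA as explicit approximations of the implicit proximal point step. The behaviour for other step sizes or other problems is not covered by this sketch.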
