Solving Weakly-Convex-Weakly-Concave Saddle-Point Problems as Successive Strongly Monotone Variational Inequalities

In this paper, we consider first-order algorithms for solving a class of non-convex non-concave min-max saddle-point problems whose objective function is weakly convex in the minimization variable and weakly concave in the maximization variable. Such problems have many important applications in machine learning, statistics, and operations research; one example that has recently attracted tremendous attention in machine learning is the training of Generative Adversarial Networks. We propose an algorithmic framework motivated by the inexact proximal point method: it solves the weakly monotone variational inequality corresponding to the original min-max problem by approximately solving a sequence of strongly monotone variational inequalities, each constructed by adding a strongly monotone mapping to the original gradient mapping. In this sequence, each strongly monotone variational inequality is defined with a proximal center that is updated using the approximate solution of the previous variational inequality. Our algorithm generates a sequence of solutions that provably converges to a nearly stationary solution of the original min-max problem. The proposed framework is flexible in that various subroutines can be employed to solve the strongly monotone variational inequalities. We establish the overall computational complexity of our methods when the employed subroutine is the subgradient method, the stochastic subgradient method, gradient descent, Nesterov's accelerated method, or a variance-reduction method for a Lipschitz continuous operator. To the best of our knowledge, this is the first work that establishes non-asymptotic convergence to a nearly stationary point of a non-convex non-concave min-max problem.
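The successive-regularization idea described above can be illustrated with a minimal sketch (not the paper's exact algorithm or step sizes, which are assumptions here): at stage k we regularize the original operator F with a strongly monotone term gamma * (z - z_k), approximately solve the resulting strongly monotone variational inequality with a simple inner subroutine, and then move the proximal center to the approximate solution.

```python
import numpy as np

def inexact_ppm(F, z0, gamma=1.0, outer_iters=50, inner_iters=100, eta=0.05):
    """Inexact proximal point framework (illustrative sketch).

    At stage k, approximately solve the strongly monotone VI with operator
        F_k(z) = F(z) + gamma * (z - z_k)
    using plain forward (gradient) steps as the inner subroutine, then
    update the proximal center z_k to the approximate solution.
    """
    z_center = np.asarray(z0, dtype=float)
    for _ in range(outer_iters):
        z = z_center.copy()
        for _ in range(inner_iters):
            # forward step on the regularized, strongly monotone operator
            z = z - eta * (F(z) + gamma * (z - z_center))
        z_center = z  # proximal center moves to the approximate VI solution
    return z_center

# Toy saddle-point problem: f(x, y) = x * y (bilinear, hence trivially
# weakly convex in x and weakly concave in y). The associated gradient
# mapping is F(x, y) = (df/dx, -df/dy) = (y, -x), whose only stationary
# point is the origin. Plain simultaneous gradient descent-ascent cycles
# on this problem, but each gamma-regularized inner VI is strongly
# monotone, so the proximal centers contract toward the saddle point.
F = lambda z: np.array([z[1], -z[0]])
z_star = inexact_ppm(F, z0=[1.0, 1.0])
print(np.linalg.norm(z_star))  # close to zero
```

In the paper's setting the same outer loop applies, but the inner subroutine can be swapped for the subgradient method, stochastic subgradient method, accelerated gradient methods, or variance-reduction methods, which is the source of the different overall complexity bounds.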
