First-Order Algorithms for Min-Max Optimization in Geodesic Metric Spaces

From optimal transport to robust dimensionality reduction, a plethora of machine learning applications can be cast as min-max optimization problems over Riemannian manifolds. Though many min-max algorithms have been analyzed in the Euclidean setting, it has proved elusive to translate these results to the Riemannian case. Zhang et al. [2022] have recently shown that geodesically convex-concave Riemannian problems always admit saddle-point solutions. Inspired by this result, we study whether a performance gap between Riemannian and optimal Euclidean convex-concave algorithms is necessary. We answer this question in the negative: we prove that the Riemannian corrected extragradient (RCEG) method achieves last-iterate convergence at a linear rate in the geodesically strongly-convex-concave case, matching the Euclidean result. Our results also extend to the stochastic or non-smooth case, where RCEG and Riemannian gradient descent ascent (RGDA) achieve near-optimal convergence rates up to factors depending on the curvature of the manifold.
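For concreteness, the following is a minimal sketch (not the paper's implementation) of what a corrected-extragradient loop of this kind looks like when written against an abstract exponential/logarithm-map interface. The update form in the docstring is one standard way to write a Riemannian-corrected extragradient step, and the names `EuclideanManifold`, `rceg`, the step size, and the bilinear toy problem are all illustrative assumptions. With the flat Exp/Log maps the update collapses to the classical Euclidean extragradient method of Korpelevich [124], which is the sanity check the example runs.

```python
import numpy as np

class EuclideanManifold:
    """Flat stand-in for a Riemannian manifold: Exp and Log reduce to
    vector addition and subtraction, so the corrected extragradient
    update below collapses to the classical Euclidean extragradient."""

    def exp(self, p, v):
        # Exp_p(v): move from p along the tangent vector v.
        return p + v

    def log(self, p, q):
        # Log_p(q): tangent vector at p pointing toward q.
        return q - p

def rceg(manifold, grad_x, grad_y, x0, y0, step=0.2, iters=500):
    """One plausible form of a Riemannian corrected extragradient loop:

        z_{t+1/2} = Exp_{z_t}(-eta * F(z_t))
        z_{t+1}   = Exp_{z_{t+1/2}}(Log_{z_{t+1/2}}(z_t) - eta * F(z_{t+1/2}))

    with F = (grad_x f, -grad_y f) the saddle-point vector field
    (descend in x, ascend in y)."""
    x, y = x0, y0
    for _ in range(iters):
        # Half step from the current iterate.
        xh = manifold.exp(x, -step * grad_x(x, y))
        yh = manifold.exp(y,  step * grad_y(x, y))
        # Corrected full step: the field is evaluated at the midpoint,
        # and the old iterate is pulled back there via the Log map.
        x = manifold.exp(xh, manifold.log(xh, x) - step * grad_x(xh, yh))
        y = manifold.exp(yh, manifold.log(yh, y) + step * grad_y(xh, yh))
    return x, y

if __name__ == "__main__":
    # Toy bilinear saddle problem f(x, y) = <x, y> on R^2 x R^2, whose unique
    # saddle point is (0, 0); plain gradient descent ascent cycles here,
    # while extragradient-type updates contract toward the solution.
    M = EuclideanManifold()
    grad_x = lambda x, y: y  # gradient of <x, y> in x
    grad_y = lambda x, y: x  # gradient of <x, y> in y
    x, y = rceg(M, grad_x, grad_y, np.array([1.0, -2.0]), np.array([3.0, 0.5]))
    print(np.linalg.norm(x), np.linalg.norm(y))  # both close to zero
```

Running the same loop on a genuinely curved space would only require supplying that manifold's own Exp and Log maps (and, for the stochastic analysis, noisy Riemannian gradients); the loop structure itself is unchanged.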

[1] Bamdev Mishra, et al. Riemannian Hamiltonian methods for min-max optimization on manifolds, 2022, ArXiv.

[2] S. Sra, et al. Minimax in Geodesic Metric Spaces: Sion's Theorem and Algorithms, 2022, ArXiv.

[3] M. Hasegawa-Johnson, et al. Fast and Efficient MMD-based Fair PCA via Optimization over Stiefel Manifold, 2021, AAAI.

[4] Guanghui Lan, et al. Simple and optimal methods for stochastic variational inequalities, I: operator extrapolation, 2020, SIAM J. Optim.

[5] Shuzhong Zhang, et al. On lower iteration complexity bounds for the convex concave saddle point problems, 2019, Math. Program.

[6] Yang Cai, et al. Tight Last-Iterate Convergence of the Extragradient Method for Constrained Monotone Variational Inequalities, 2022, ArXiv.

[7] Junbin Gao, et al. On Riemannian Optimization over Positive Definite Matrices with the Bures-Wasserstein Geometry, 2021, NeurIPS.

[8] D. Schuurmans, et al. Leveraging Non-uniformity in First-order Non-convex Optimization, 2021, ICML.

[9] Georgios Piliouras, et al. Solving Min-Max Optimization with Hidden Structure via Gradient Descent Ascent, 2021, NeurIPS.

[10] P. Mertikopoulos, et al. Survival of the strictest: Stable and unstable equilibria under regularized learning with partial information, 2021, COLT.

[11] Paul W. Goldberg, et al. The Complexity of Gradient Descent: CLS = PPAD ∩ PLS, 2020, ArXiv.

[12] Erfan Yazdandoost Hamedani, et al. A Primal-Dual Algorithm with Line Search for General Convex-Concave Saddle Point Problems, 2020, SIAM J. Optim.

[13] Michael I. Jordan, et al. On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification, 2020, AISTATS.

[14] Meisam Razaviyayn, et al. Efficient Search of First-Order Nash Equilibria in Nonconvex-Concave Smooth Min-Max Problems, 2020, SIAM J. Optim.

[15] Anthony Man-Cho So, et al. Weakly Convex Optimization over Stiefel Manifold Using Riemannian Subgradient-Type Methods, 2019, SIAM J. Optim.

[16] Jacob Abernethy, et al. Last-iterate convergence rates for min-max optimization, 2019, ArXiv.

[17] Weiwei Kong, et al. An Accelerated Inexact Proximal Point Method for Solving Nonconvex-Concave Min-Max Problems, 2019, SIAM J. Optim.

[18] Mingrui Liu, et al. First-order Convergence Theory for Weakly-Convex-Weakly-Concave Min-max Problems, 2018, J. Mach. Learn. Res.

[19] P. Mertikopoulos, et al. On the Rate of Convergence of Regularized Learning in Games: From Bandits and Uncertainty to Optimism and Beyond, 2021, NeurIPS.

[20] Georgios Piliouras, et al. No-regret learning and mixed Nash equilibria: They do not mix, 2020, NeurIPS.

[21] Feihu Huang, et al. Gradient Descent Ascent for Minimax Problems on Riemannian Manifolds, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22] Nicolas Boumal, et al. An Accelerated First-Order Method for Non-convex Optimization on Manifolds, 2020, Foundations of Computational Mathematics.

[23] Projection Robust Wasserstein Distance and Riemannian Optimization, 2020, NeurIPS.

[24] Kimon Antonakopoulos, et al. Online and stochastic optimization beyond Lipschitz continuity: A Riemannian approach, 2020, ICLR.

[25] Jelena Diakonikolas, Halpern Iteration for Near-Optimal and Parameter-Free Monotone Inclusion and Strong Solutions to Variational Inequalities, 2020, COLT.

[26] Asuman Ozdaglar, et al. An Optimal Multistage Stochastic Gradient Method for Minimax Problems, 2020, 2020 59th IEEE Conference on Decision and Control (CDC).

[27] Michael I. Jordan, et al. Near-Optimal Algorithms for Minimax Optimization, 2020, COLT.

[28] Noah Golowich, et al. Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems, 2020, COLT.

[29] Mingrui Liu, et al. Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets, 2019, ICLR.

[30] Aurélien Lucchi, et al. A Continuous-time Perspective for Modeling Acceleration in Riemannian Optimization, 2019, AISTATS.

[31] Z. Wen, et al. A Brief Introduction to Manifold Optimization, 2019, Journal of the Operations Research Society of China.

[32] Michael I. Jordan, et al. On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems, 2019, ICML.

[33] Yongxin Chen, et al. Hybrid Block Successive Approximation for One-Sided Non-Convex Min-Max Problems: Algorithms and Applications, 2019, IEEE Transactions on Signal Processing.

[34] Michael I. Jordan, et al. What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization?, 2019, ICML.

[35] Aryan Mokhtari, et al. A Unified Analysis of Extra-gradient and Optimistic Gradient Methods for Saddle Point Problems: Proximal Point Approach, 2019, AISTATS.

[36] Shiqian Ma, et al. Proximal Gradient Method for Nonsmooth Optimization over the Stiefel Manifold, 2018, SIAM J. Optim.

[37] S. Shankar Sastry, et al. On Gradient-Based Learning in Continuous Games, 2018, SIAM J. Math. Data Sci.

[38] Shiqian Ma, et al. Primal-dual optimization algorithms over Riemannian manifolds: an iteration complexity analysis, 2017, Mathematical Programming.

[39] P. Absil, et al. Erratum to: "Global rates of convergence for nonconvex optimization on manifolds", 2016, IMA Journal of Numerical Analysis.

[40] Ioannis Mitliagkas, et al. A Tight and Unified Analysis of Gradient-Based Methods for a Whole Spectrum of Differentiable Games, 2020, AISTATS.

[41] Georgios Piliouras, et al. Poincaré Recurrence, Cycles and Spurious Equilibria in Gradient-Descent-Ascent for Non-Convex Non-Concave Zero-Sum Games, 2019, NeurIPS.

[42] Prateek Jain, et al. Efficient Algorithms for Smooth Minimax Optimization, 2019, NeurIPS.

[43] Sehie Park, Riemannian manifolds are KKM spaces, 2019, Advances in the Theory of Nonlinear Analysis and its Application.

[44] Maryam Fazel, et al. Escaping from saddle points on Riemannian manifolds, 2019, NeurIPS.

[45] Nicolas Boumal, et al. Efficiently escaping saddle points on manifolds, 2019, NeurIPS.

[46] Pierre-Antoine Absil, et al. A Collection of Nonsmooth Riemannian Optimization Problems, 2019, Nonsmooth Optimization and Its Applications.

[47] Jason D. Lee, et al. Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods, 2019, NeurIPS.

[48] Michael I. Jordan, et al. On Nonconvex Optimization for Machine Learning, 2019, J. ACM.

[49] Hiroyuki Kasai, et al. Riemannian adaptive stochastic gradient algorithms on matrix manifolds, 2019, ICML.

[50] Gary Bécigneul, et al. Riemannian Adaptive Optimization Methods, 2018, ICLR.

[51] Mathias Staudigl, et al. Hessian barrier algorithms for linearly constrained optimization problems, 2018, SIAM J. Optim.

[52] Bo Jiang, et al. Structured Quasi-Newton Methods for Optimization with Orthogonality Constraints, 2018, SIAM J. Sci. Comput.

[53] Ioannis Mitliagkas, et al. Negative Momentum for Improved Game Dynamics, 2018, AISTATS.

[54] Chuan-Sheng Foo, et al. Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile, 2018, ICLR.

[55] Constantinos Daskalakis, et al. Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization, 2018, ITCS.

[56] Stefan Winkler, et al. The Unusual Effectiveness of Averaging in GAN Training, 2018, ICLR.

[57] Anthony Man-Cho So, et al. Quadratic optimization with orthogonality constraint: explicit Łojasiewicz exponent and linear convergence of retraction-based line-search and stochastic variance-reduced gradient methods, 2018, Mathematical Programming.

[58] Thomas Hofmann, et al. Local Saddle Point Optimization: A Curvature Exploitation Approach, 2018, AISTATS.

[59] Roland Herzog, et al. Intrinsic Formulation of KKT Conditions and Constraint Qualifications on Smooth Manifolds, 2018, SIAM J. Optim.

[60] Tengyuan Liang, et al. Interaction Matters: A Note on Non-asymptotic Local Convergence of Generative Adversarial Networks, 2018, AISTATS.

[61] Ya-Xiang Yuan, et al. Adaptive Quadratically Regularized Newton Method for Riemannian Optimization, 2018, SIAM J. Matrix Anal. Appl.

[62] Constantinos Daskalakis, et al. The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization, 2018, NeurIPS.

[63] Michael I. Jordan, et al. Averaging Stochastic Gradient Descent on Riemannian Manifolds, 2018, COLT.

[64] Thore Graepel, et al. The Mechanics of n-Player Differentiable Games, 2018, ICML.

[65] Xiaojun Chen, et al. A New First-Order Algorithmic Framework for Optimization Problems with Orthogonality Constraints, 2018, SIAM J. Optim.

[66] Xueyan Jiang, et al. Metrics for Deep Generative Models, 2017, AISTATS.

[67] Xianglong Liu, et al. Orthogonal Weight Normalization: Solution to Optimization over Multiple Dependent Stiefel Manifolds in Deep Neural Networks, 2017, AAAI.

[68] Alexander J. Smola, et al. A Generic Approach for Escaping Saddle points, 2017, AISTATS.

[69] Bamdev Mishra, et al. A Unified Framework for Structured Low-rank Matrix Learning, 2017, ICML.

[70] William H. Sandholm, et al. Riemannian game dynamics, 2016, J. Econ. Theory.

[71] Hiroyuki Kasai, et al. Inexact trust-region algorithms on Riemannian manifolds, 2018, NeurIPS.

[72] Sepp Hochreiter, et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium, 2017, NIPS.

[73] Michael I. Jordan, et al. Gradient Descent Can Take Exponential Time to Escape Saddle Points, 2017, NIPS.

[74] Abhishek Kumar, et al. Semi-supervised Learning with GANs: Manifold Invariance with Improved Inference, 2017, NIPS.

[75] Yi Zheng, et al. No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis, 2017, ICML.

[76] Michael I. Jordan, et al. How to Escape Saddle Points Efficiently, 2017, ICML.

[77] Jefferson G. Melo, et al. Iteration-Complexity of Gradient, Subgradient and Proximal Point Methods on Riemannian Manifolds, 2016, Journal of Optimization Theory and Applications.

[78] John Wright, et al. Complete Dictionary Recovery Over the Sphere II: Recovery by Riemannian Trust-Region Method, 2015, IEEE Transactions on Information Theory.

[79] John Wright, et al. Complete Dictionary Recovery Over the Sphere I: Overview and the Geometric Picture, 2015, IEEE Transactions on Information Theory.

[80] Suvrit Sra, et al. Fast stochastic optimization on Riemannian manifolds, 2016, ArXiv.

[81] Suvrit Sra, et al. First-order Methods for Geodesically Convex Optimization, 2016, COLT.

[82] Anima Anandkumar, et al. Efficient approaches for escaping higher order saddle points in non-convex optimization, 2016, COLT.

[83] Alexandru Kristály, Nash-type equilibria on Riemannian manifolds: a variational approach, 2016, 1602.04157.

[84] A. Zaslavski, Proximal Point Algorithm, 2016.

[85] Suvrit Sra, et al. Geometric Optimization in Machine Learning, 2016.

[86] Suvrit Sra, et al. Matrix Manifold Optimization for Gaussian Mixtures, 2015, NIPS.

[87] Furong Huang, et al. Escaping From Saddle Points - Online Stochastic Gradient for Tensor Decomposition, 2015, COLT.

[88] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[89] Suvrit Sra, et al. Conic Geometric Optimization on the Manifold of Positive Definite Matrices, 2013, SIAM J. Optim.

[90] Sayan Mukherjee, et al. The Information Geometry of Mirror Descent, 2013, IEEE Transactions on Information Theory.

[91] M. Bacák, Convex Analysis and Optimization in Hadamard Spaces, 2014.

[92] S. Ivanov, On Helly's theorem in geodesic spaces, 2014, 1401.6654.

[93] Bamdev Mishra, et al. Manopt, a matlab toolbox for optimization on manifolds, 2013, J. Mach. Learn. Res.

[94] Wotao Yin, et al. A feasible method for optimization with orthogonality constraints, 2013, Math. Program.

[95] Bart Vandereycken, et al. Low-Rank Matrix Completion by Riemannian Optimization, 2013, SIAM J. Optim.

[96] René Vidal, et al. Riemannian Consensus for Manifolds With Bounded Curvature, 2012, IEEE Transactions on Automatic Control.

[97] Silvere Bonnabel, et al. Stochastic Gradient Descent on Riemannian Manifolds, 2011, IEEE Transactions on Automatic Control.

[98] Ami Wiesel, et al. Geodesic Convexity and Covariance Estimation, 2012, IEEE Transactions on Signal Processing.

[99] Pierre-Antoine Absil, et al. RTRMC: A Riemannian trust-region method for low-rank matrix completion, 2011, NIPS.

[100] Heinz H. Bauschke, et al. Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2011, CMS Books in Mathematics.

[101] Massimo Fornasier, et al. Low-rank Matrix Recovery via Iteratively Reweighted Least Squares Minimization, 2010, SIAM J. Optim.

[102] J. H. Wang, et al. Monotone and Accretive Vector Fields on Riemannian Manifolds, 2010.

[103] Chong Li, et al. Monotone vector fields and the proximal point algorithm on Hadamard manifolds, 2009.

[104] Alexander Shapiro, et al. Stochastic Approximation approach to Stochastic Programming, 2013.

[105] Levent Tunçel, et al. Optimization algorithms on matrix manifolds, 2009, Math. Comput.

[106] Laurent El Ghaoui, et al. Robust Optimization, 2021, ICORES.

[107] A. Juditsky, et al. Solving variational inequalities with Stochastic Mirror-Prox algorithm, 2008, 0809.0815.

[108] Stephen P. Boyd, et al. Enhancing Sparsity by Reweighted ℓ1 Minimization, 2007, 0711.1612.

[109] P. Thomas Fletcher, et al. Riemannian geometry for the statistical analysis of diffusion tensor data, 2007, Signal Process.

[110] H. Robbins, A Stochastic Approximation Method, 1951.

[111] Orizon Pereira Ferreira, et al. Singularities of Monotone Vector Fields and an Extragradient-type Algorithm, 2005, J. Glob. Optim.

[112] Xavier Pennec, et al. A Riemannian Framework for Tensor Computing, 2005, International Journal of Computer Vision.

[113] Max-K. von Renesse, et al. Heat Kernel Comparison on Alexandrov Spaces with Curvature Bounded Below, 2004.

[114] Arkadi Nemirovski, Prox-Method with Rate of Convergence O(1/t) for Variational Inequalities with Lipschitz Continuous Monotone Operators and Smooth Convex-Concave Saddle Point Problems, 2004, SIAM J. Optim.

[115] F. Facchinei, et al. Finite-Dimensional Variational Inequalities and Complementarity Problems, 2003.

[116] John M. Lee, Introduction to Smooth Manifolds, 2002.

[117] D. Burago, et al. A Course in Metric Geometry, 2001.

[118] O. P. Ferreira, et al. Subgradient Algorithm on Riemannian Manifolds, 1998.

[119] J. Eschenburg, Comparison Theorems in Riemannian Geometry, 1994.

[120] Boris Polyak, Acceleration of stochastic approximation by averaging, 1992.

[121] D. Ruppert, et al. Efficient Estimations from a Slowly Convergent Robbins-Monro Process, 1988.

[122] H. Komiya, Elementary proof for Sion's minimax theorem, 1988.

[123] R. Rockafellar, Monotone Operators and the Proximal Point Algorithm, 1976.

[124] G. M. Korpelevich, The extragradient method for finding saddle points and other problems, 1976.

[125] M. Sion, On general minimax theorems, 1958.

[126] S. Kakutani, A generalization of Brouwer's fixed point theorem, 1941.

[127] Bronisław Knaster, et al. Ein Beweis des Fixpunktsatzes für n-dimensionale Simplexe, 1929.

[128] J. Neumann, Zur Theorie der Gesellschaftsspiele, 1928.

[129] I. Holopainen, Riemannian Geometry, 1927, Nature.

[130] E. Helly, Über Mengen konvexer Körper mit gemeinschaftlichen Punkten, 1923.

[131] L. Brouwer, Über Abbildung von Mannigfaltigkeiten, 1911.