Adaptive regularization with cubics on manifolds

Adaptive regularization with cubics (ARC) is an algorithm for unconstrained, non-convex optimization. Akin to the popular trust-region method, its iterations can be thought of as approximate, safeguarded Newton steps. For cost functions with Lipschitz continuous Hessian, ARC has optimal iteration complexity, in the sense that it produces an iterate with gradient norm smaller than $\varepsilon$ in $O(1/\varepsilon^{1.5})$ iterations. For the same price, it can also guarantee a Hessian with smallest eigenvalue larger than $-\varepsilon^{1/2}$. In this paper, we study a generalization of ARC to optimization on Riemannian manifolds. In particular, we generalize the iteration complexity results to this richer framework. Our central contribution lies in the identification of appropriate manifold-specific assumptions that allow us to secure these complexity guarantees both when using the exponential map and when using a general retraction. A substantial part of the paper is devoted to studying these assumptions---relevant beyond ARC---and providing user-friendly sufficient conditions for them. Numerical experiments are encouraging.
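To fix ideas, the classical (Euclidean) ARC iteration can be sketched as follows: at each iterate, approximately minimize a cubic-regularized second-order model, then accept or reject the step and adapt the regularization weight. This is a minimal sketch, not the paper's Riemannian algorithm; the crude gradient-descent subproblem solver and the parameter values (`eta`, `gamma`, inner step size) are illustrative assumptions, whereas practical ARC implementations solve the subproblem with Krylov-type methods.

```python
import numpy as np

def arc_minimize(f, grad, hess, x0, sigma0=1.0, eta=0.1,
                 gamma=2.0, grad_tol=1e-6, max_iter=100):
    """Minimal Euclidean ARC sketch. At each iterate x, approximately
    minimize the cubic-regularized model
        m(s) = f(x) + g^T s + (1/2) s^T H s + (sigma/3) ||s||^3,
    then accept/reject the step and adapt sigma."""
    x, sigma = np.asarray(x0, dtype=float), sigma0
    for _ in range(max_iter):
        g, H = grad(x), hess(x)
        if np.linalg.norm(g) < grad_tol:
            break
        # Crude subproblem solver: gradient descent on the cubic model,
        # started at s = 0 (illustrative; not how ARC is implemented).
        s = np.zeros_like(x)
        for _ in range(200):
            model_grad = g + H @ s + sigma * np.linalg.norm(s) * s
            s -= 0.01 * model_grad
        model_decrease = -(g @ s + 0.5 * s @ H @ s
                           + (sigma / 3) * np.linalg.norm(s) ** 3)
        rho = (f(x) - f(x + s)) / model_decrease  # actual vs. predicted
        if rho > eta:   # successful step: accept it, relax regularization
            x, sigma = x + s, max(sigma / gamma, 1e-8)
        else:           # unsuccessful step: increase regularization
            sigma *= gamma
    return x
```

The cubic term $(\sigma/3)\|s\|^3$ plays the role of the trust region: when the model predicts the decrease poorly ($\rho \le \eta$), $\sigma$ grows and the next step shrinks. The Riemannian generalization studied in the paper replaces the step $x + s$ by a retraction of a tangent vector and uses the Riemannian gradient and Hessian.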