Adaptive regularization with cubics on manifolds

Adaptive regularization with cubics (ARC) is an algorithm for unconstrained, non-convex optimization. Akin to the popular trust-region method, its iterations can be thought of as approximate, safeguarded Newton steps. For cost functions with Lipschitz continuous Hessian, ARC has optimal iteration complexity, in the sense that it produces an iterate with gradient norm smaller than $\varepsilon$ in $O(1/\varepsilon^{1.5})$ iterations. For the same price, it can also guarantee a Hessian with smallest eigenvalue larger than $-\varepsilon^{1/2}$. In this paper, we study a generalization of ARC to optimization on Riemannian manifolds. In particular, we generalize the iteration complexity results to this richer framework. Our central contribution lies in the identification of appropriate manifold-specific assumptions that allow us to secure these complexity guarantees both when using the exponential map and when using a general retraction. A substantial part of the paper is devoted to studying these assumptions---relevant beyond ARC---and providing user-friendly sufficient conditions for them. Numerical experiments are encouraging.
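To fix ideas, the classical (Euclidean) ARC iteration can be sketched as follows: at each iterate, approximately minimize a cubic-regularized second-order model, then accept or reject the step and adapt the regularization weight. This is a minimal sketch, not the paper's Riemannian algorithm; the crude gradient-descent subproblem solver and the parameter values (`eta`, `gamma`, inner step size) are illustrative assumptions, whereas practical ARC implementations solve the subproblem with Krylov-type methods.

```python
import numpy as np

def arc_minimize(f, grad, hess, x0, sigma0=1.0, eta=0.1,
                 gamma=2.0, grad_tol=1e-6, max_iter=100):
    """Minimal Euclidean ARC sketch. At each iterate x, approximately
    minimize the cubic-regularized model
        m(s) = f(x) + g^T s + (1/2) s^T H s + (sigma/3) ||s||^3,
    then accept/reject the step and adapt sigma."""
    x, sigma = np.asarray(x0, dtype=float), sigma0
    for _ in range(max_iter):
        g, H = grad(x), hess(x)
        if np.linalg.norm(g) < grad_tol:
            break
        # Crude subproblem solver: gradient descent on the cubic model,
        # started at s = 0 (illustrative; not how ARC is implemented).
        s = np.zeros_like(x)
        for _ in range(200):
            model_grad = g + H @ s + sigma * np.linalg.norm(s) * s
            s -= 0.01 * model_grad
        model_decrease = -(g @ s + 0.5 * s @ H @ s
                           + (sigma / 3) * np.linalg.norm(s) ** 3)
        rho = (f(x) - f(x + s)) / model_decrease  # actual vs. predicted
        if rho > eta:   # successful step: accept it, relax regularization
            x, sigma = x + s, max(sigma / gamma, 1e-8)
        else:           # unsuccessful step: increase regularization
            sigma *= gamma
    return x
```

The cubic term $(\sigma/3)\|s\|^3$ plays the role of the trust region: when the model predicts the decrease poorly ($\rho \le \eta$), $\sigma$ grows and the next step shrinks. The Riemannian generalization studied in the paper replaces the step $x + s$ by a retraction of a tangent vector and uses the Riemannian gradient and Hessian.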