An Estimate Sequence for Geodesically Convex Optimization

We propose a Riemannian version of Nesterov's Accelerated Gradient algorithm (RAGD) and show that, for geodesically smooth and strongly convex problems, RAGD converges to the minimizer at an accelerated rate within a neighborhood of the minimizer whose radius depends on the condition number as well as the sectional curvature of the manifold. Unlike the algorithm of Liu et al. (2017), which requires the exact solution of a nonlinear equation that may be intractable, our algorithm is constructive and computationally tractable. Our proof exploits a new estimate sequence and a novel bound on the nonlinear metric distortion, both of which may be of independent interest.
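
To make the update rule concrete, the sketch below shows a simplified Riemannian accelerated gradient iteration on the unit sphere, using the sphere's closed-form exponential and logarithm maps to minimize a Rayleigh quotient. The momentum step is the geodesic analog of Nesterov's extrapolation y = x + β(x − x_prev). This is a minimal illustration under stated assumptions, not the paper's method: the names exp_map, log_map, and accelerated_sketch and the constant parameters eta and beta are illustrative choices, and the actual RAGD selects its coupling coefficients through the estimate sequence rather than a fixed momentum constant.

```python
import numpy as np

# Minimal sketch of Riemannian accelerated gradient descent on the unit
# sphere S^{n-1}, minimizing the Rayleigh quotient f(x) = x^T A x.
# Simplified constant-momentum variant for illustration only: the paper's
# RAGD couples its iterates through an estimate sequence with
# curvature-dependent parameters, which is not reproduced here.


def exp_map(x, v):
    """Sphere exponential map: follow the geodesic from x with velocity v."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return x
    return np.cos(nv) * x + np.sin(nv) * v / nv


def log_map(x, y):
    """Sphere logarithm map: the tangent vector at x pointing toward y."""
    c = np.clip(x @ y, -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(x)
    w = y - c * x
    return theta * w / np.linalg.norm(w)


def riem_grad(A, x):
    """Riemannian gradient of f(x) = x^T A x: project 2Ax onto T_x S^{n-1}."""
    g = 2.0 * A @ x
    return g - (x @ g) * x


def accelerated_sketch(A, x0, eta=0.1, beta=0.7, iters=300):
    """Gradient step from an extrapolated point, then geodesic momentum."""
    x_prev, y = x0, x0
    for _ in range(iters):
        # Gradient step taken from the extrapolated point y.
        x = exp_map(y, -eta * riem_grad(A, y))
        # Momentum: move away from the previous iterate along the geodesic,
        # the Riemannian analog of y = x + beta * (x - x_prev).
        y = exp_map(x, -beta * log_map(x, x_prev))
        x_prev = x
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = rng.standard_normal((50, 50))
    A = M @ M.T
    A /= np.linalg.norm(A, 2)        # normalize the spectral norm to 1
    x0 = rng.standard_normal(50)
    x0 /= np.linalg.norm(x0)
    x = accelerated_sketch(A, x0)
    print("f(x) =", x @ A @ x, " vs  lambda_min =", np.linalg.eigvalsh(A)[0])
```

On a general Riemannian manifold one would swap in that manifold's exponential and logarithm maps; the curvature-dependent neighborhood in the theorem above is where such a scheme can be shown to retain the accelerated rate.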

[1] Mark W. Schmidt et al. Minimizing finite sums with the stochastic average gradient. Mathematical Programming, 2013.

[2] M. Bacák. Convex Analysis and Optimization in Hadamard Spaces. 2014.

[3] Andre Wibisono et al. A variational perspective on accelerated methods in optimization. Proceedings of the National Academy of Sciences, 2016.

[4] Yi Zheng et al. No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis. ICML, 2017.

[5] Ohad Shamir et al. On Lower and Upper Bounds for Smooth and Strongly Convex Optimization Problems. arXiv, 2015.

[6] Narendra Karmarkar et al. A new polynomial-time algorithm for linear programming. STOC, 1984.

[7] Ohad Shamir et al. A Stochastic PCA and SVD Algorithm with an Exponential Convergence Rate. ICML, 2014.

[8] D. Burago et al. A Course in Metric Geometry. 2001.

[9] Boris Polyak. Gradient methods for the minimisation of functionals. 1963.

[10] Zeyuan Allen-Zhu et al. Linear Coupling: An Ultimate Unification of Gradient and Mirror Descent. ITCS, 2014.

[11] Bamdev Mishra et al. Riemannian Preconditioning. SIAM J. Optim., 2014.

[12] Yair Carmon et al. "Convex Until Proven Guilty": Dimension-Free Acceleration of Gradient Descent on Non-Convex Functions. ICML, 2017.

[13] Mohit Singh et al. A geometric alternative to Nesterov's accelerated gradient descent. arXiv, 2015.

[14] Hiroyuki Kasai et al. Riemannian stochastic variance reduced gradient. SIAM J. Optim., 2016.

[15] Suvrit Sra et al. Fast stochastic optimization on Riemannian manifolds. arXiv, 2016.

[16] P. Absil et al. Erratum to: "Global rates of convergence for nonconvex optimization on manifolds". IMA Journal of Numerical Analysis, 2016.

[17] Nicolas Boumal et al. The non-convex Burer-Monteiro approach works on smooth semidefinite programs. NIPS, 2016.

[18] A. S. Nemirovsky and D. B. Yudin. Problem Complexity and Method Efficiency in Optimization. 1983.

[19] Saeed Ghadimi et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming. SIAM J. Optim., 2013.

[20] Wolfgang Meyer et al. Toponogov's Theorem and Applications. 2004.

[21] Yair Carmon et al. Accelerated Methods for Non-Convex Optimization. SIAM J. Optim., 2016.

[22] Tengyu Ma et al. Finding Approximate Local Minima for Nonconvex Optimization in Linear Time. arXiv, 2016.

[23] Yuanyuan Liu et al. Accelerated First-order Methods for Geodesically Convex Optimization on Riemannian Manifolds. NIPS, 2017.

[24] O. P. Ferreira et al. Proximal Point Algorithm on Riemannian Manifolds. 2002.

[25] L. Khachiyan. Polynomial algorithms in linear programming. 1980.

[26] Alexander J. Smola et al. Stochastic Variance Reduction for Nonconvex Optimization. ICML, 2016.

[27] John Wright et al. Complete Dictionary Recovery Over the Sphere I: Overview and the Geometric Picture. IEEE Transactions on Information Theory, 2015.

[28] C. Udriste et al. Convex Functions and Optimization Methods on Riemannian Manifolds. 1994.

[29] Francis R. Bach et al. From Averaging to Acceleration, There is Only a Step-size. COLT, 2015.

[30] Tong Zhang et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction. NIPS, 2013.

[31] Stephen P. Boyd et al. A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights. J. Mach. Learn. Res., 2014.

[32] L. Ambrosio et al. Metric measure spaces with Riemannian Ricci curvature bounded from below. arXiv:1109.0222, 2011.

[33] Benjamin Recht et al. Analysis and Design of Optimization Algorithms via Integral Quadratic Constraints. SIAM J. Optim., 2014.

[34] Yurii Nesterov. Lectures on Convex Optimization. 2018.

[35] Benar Fux Svaiter et al. Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods. Math. Program., 2013.

[36] Suvrit Sra et al. First-order Methods for Geodesically Convex Optimization. COLT, 2016.

[37] Francis Bach et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives. NIPS, 2014.

[38] Kenji Kawaguchi et al. Deep Learning without Poor Local Minima. NIPS, 2016.

[39] Robert E. Mahony et al. Optimization Algorithms on Matrix Manifolds. 2007.

[40] J. Jost. Riemannian geometry and geometric analysis. 1995.