Inexact trust-region algorithms on Riemannian manifolds

We consider an inexact variant of the popular Riemannian trust-region algorithm for structured big-data minimization problems. The proposed algorithm approximates not only the solution of the trust-region sub-problem but also the gradient and the Hessian. For large-scale finite-sum problems in particular, we propose sub-sampled algorithms in which the gradient and the Hessian are estimated from randomly sampled batches whose sizes are bounded by a fixed constant. Numerical evaluations demonstrate that the proposed algorithms outperform state-of-the-art Riemannian deterministic and stochastic gradient algorithms on several applications.
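To make the idea concrete, the following Python/NumPy sketch shows what one sub-sampled Riemannian trust-region iteration could look like on the unit sphere, for the leading-eigenvector cost f(x) = -(1/n) Σ_i (z_iᵀx)². This is not the authors' implementation: the batch sizes `batch_g` and `batch_h`, the Cauchy-point subproblem solve, the acceptance thresholds (0.1, 0.75), and the use of the full cost in the acceptance ratio are all illustrative assumptions; a full method would typically use a truncated conjugate-gradient subproblem solver, as in standard Riemannian trust-region implementations such as Manopt.

```python
import numpy as np

rng = np.random.default_rng(0)

def project(x, v):
    """Orthogonal projection of v onto the tangent space of the sphere at x."""
    return v - np.dot(x, v) * x

def retract(x, v):
    """Retraction: step along the tangent vector v, then renormalise to the sphere."""
    y = x + v
    return y / np.linalg.norm(y)

def subsampled_grad(x, Z, batch):
    """Riemannian gradient of f(x) = -(1/n) sum_i (z_i^T x)^2 over a sub-sample."""
    Zb = Z[batch]
    egrad = -2.0 * (Zb.T @ (Zb @ x)) / len(batch)
    return project(x, egrad)

def subsampled_hess_vec(x, v, Z, batch):
    """Riemannian Hessian-vector product over a (possibly different) sub-sample."""
    Zb = Z[batch]
    ehess = -2.0 * (Zb.T @ (Zb @ v)) / len(batch)
    egrad = -2.0 * (Zb.T @ (Zb @ x)) / len(batch)
    # Standard sphere formula: project the Euclidean Hessian, add the curvature term.
    return project(x, ehess) - np.dot(x, egrad) * v

def tr_step(x, Z, radius, batch_g=64, batch_h=16):
    """One inexact trust-region iteration with a Cauchy-point subproblem solve."""
    n = Z.shape[0]
    g = subsampled_grad(x, Z, rng.choice(n, size=min(batch_g, n), replace=False))
    Hg = subsampled_hess_vec(x, g, Z, rng.choice(n, size=min(batch_h, n), replace=False))
    gg, gHg = np.dot(g, g), np.dot(g, Hg)
    gnorm = np.sqrt(gg)
    # Cauchy point: minimise the quadratic model along -g inside the trust region.
    t = radius / gnorm if gHg <= 0 else min(gg / gHg, radius / gnorm)
    s = -t * g
    pred = t * gg - 0.5 * t * t * gHg          # model decrease m(0) - m(s)
    f = lambda y: -np.mean((Z @ y) ** 2)       # full cost, used here for acceptance
    rho = (f(x) - f(retract(x, s))) / max(pred, 1e-12)
    x = retract(x, s) if rho > 0.1 else x      # accept or reject the step
    radius *= 2.0 if rho > 0.75 else (0.25 if rho < 0.1 else 1.0)
    return x, radius
```

A toy run on synthetic data, again purely for illustration:

```python
# Recover the leading eigenvector direction of Z^T Z.
Z = rng.standard_normal((1000, 50))
x = rng.standard_normal(50); x /= np.linalg.norm(x)
radius = 0.5
for _ in range(200):
    x, radius = tr_step(x, Z, radius)
```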
