A Nesterov-type Acceleration with Adaptive Localized Cayley Parametrization for Optimization over the Stiefel Manifold

Despite certain singular-point issues, the Cayley parametrization (CP) has great potential to serve as a key for importing many powerful strategies, originally developed for optimization over a vector space, into optimization over the Stiefel manifold. In this paper, we present (i) a computationally efficient CP that circumvents the singular-point issues and (ii) a Nesterov-type accelerated gradient method based on the proposed CP, together with its convergence analysis. To guarantee convergence, we also evaluate a Lipschitz constant of the gradient of the cost function in the CP domain. Numerical experiments show excellent performance of the proposed accelerated algorithm compared with standard algorithms for optimization over the Stiefel manifold, treated as a special instance of a Riemannian manifold, e.g., the Barzilai-Borwein method and the L-BFGS method combined with a vector transport.
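
As a rough illustration of the idea behind a Cayley parametrization, the following sketch uses the classical Cayley transform, which maps a skew-symmetric matrix V to the orthogonal matrix (I - V)(I + V)^{-1}; keeping the first p columns then gives a point on the Stiefel manifold. This is not the adaptive localized CP proposed in the paper, only the textbook transform, and the function names `cayley` and `stiefel_point` are hypothetical. Orthogonal matrices with an eigenvalue of -1 are not reached by this map, which is one way to see where singular-point issues come from.

```python
import numpy as np


def cayley(V):
    """Classical Cayley transform of a real skew-symmetric matrix V.

    Returns (I - V)(I + V)^{-1}. The two factors commute, so this equals
    (I + V)^{-1}(I - V), which is computed here with a linear solve.
    I + V is always invertible because the eigenvalues of a real
    skew-symmetric matrix are purely imaginary.
    """
    n = V.shape[0]
    identity = np.eye(n)
    return np.linalg.solve(identity + V, identity - V)


def stiefel_point(V, p):
    """Hypothetical helper: first p columns of the Cayley transform of V.

    The columns of an orthogonal matrix are orthonormal, so the result U
    satisfies U^T U = I_p, i.e. it lies on the Stiefel manifold St(p, N).
    """
    return cayley(V)[:, :p]


# Build a random skew-symmetric matrix (an unconstrained vector-space parameter).
rng = np.random.default_rng(0)
N, p = 5, 2
A = rng.standard_normal((N, N))
V = A - A.T

U = stiefel_point(V, p)
# Deviation from orthonormality; this should be at rounding-error level.
orth_error = np.linalg.norm(U.T @ U - np.eye(p))
```

Because V ranges over a vector space, a cost function composed with such a map can in principle be minimized with vector-space methods such as a Nesterov-type accelerated gradient scheme, which is the general strategy the abstract describes.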
