Accelerated Factored Gradient Descent for Low-Rank Matrix Factorization

We study the low-rank matrix estimation problem, where the objective function L(M) is defined over the space of positive semidefinite matrices with rank at most r. A fast approach to this problem is matrix factorization, which reparameterizes M as the product of two smaller matrices, M = UU^T, and then performs gradient descent on U directly, a procedure known as factored gradient descent. Since the resulting problem is nonconvex, whether Nesterov's acceleration scheme can be adapted to it has remained a long-standing question. In this paper, we answer this question affirmatively by proposing a novel and practical accelerated factored gradient descent method motivated by Nesterov's accelerated gradient descent. The proposed method enjoys better iteration complexity and computational complexity than state-of-the-art algorithms in a wide regime. The key idea of our algorithm is to restrict all of its iterates to a special convex set, which enables the acceleration. Experimental results demonstrate the faster convergence of our algorithm and corroborate our theory.
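
To make the factored gradient descent baseline concrete, the following is a minimal NumPy sketch of the unaccelerated method on a toy objective L(M) = (1/2)||M - M_star||_F^2. The loss, step size eta, dimensions, and iteration count are illustrative assumptions; the sketch does not include the paper's momentum step or its projection onto the special convex set.

import numpy as np

# Minimal sketch of plain (unaccelerated) factored gradient descent:
# minimize L(M) over rank-r PSD matrices by writing M = U U^T and
# running gradient descent on U. The loss here is an illustrative
# choice, L(M) = 0.5 * ||M - M_star||_F^2.

rng = np.random.default_rng(0)
n, r = 50, 3

# Ground-truth rank-r PSD matrix M_star = U_star U_star^T.
U_star = rng.standard_normal((n, r))
M_star = U_star @ U_star.T

def grad_L(M):
    # Gradient of the example loss L(M) = 0.5 * ||M - M_star||_F^2.
    return M - M_star

U = rng.standard_normal((n, r))  # random initialization
eta = 0.01                       # step size (hypothetical choice)

for t in range(500):
    M = U @ U.T
    # Chain rule: the gradient of L(U U^T) with respect to U is
    # (grad_L(M) + grad_L(M)^T) @ U, which equals 2 * grad_L(M) @ U
    # when grad_L(M) is symmetric, as it is here.
    U = U - eta * (grad_L(M) + grad_L(M).T) @ U

print("final error:", np.linalg.norm(U @ U.T - M_star, "fro"))

Note that even when L is convex in M, the reparameterized objective f(U) = L(UU^T) is nonconvex in U, which is precisely why adapting Nesterov's acceleration to this setting is nontrivial.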
