A Universal Variance Reduction-Based Catalyst for Nonconvex Low-Rank Matrix Recovery

We propose a generic framework based on a new stochastic variance-reduced gradient descent algorithm for accelerating nonconvex low-rank matrix recovery. Starting from an appropriate initial estimator, the algorithm performs projected gradient descent using a novel semi-stochastic gradient specifically designed for low-rank matrix recovery. Under mild restricted strong convexity and smoothness conditions, we derive a projected notion of the restricted Lipschitz continuous gradient property and prove that our algorithm converges linearly to the unknown low-rank matrix with improved computational complexity. Moreover, our algorithm applies to both noiseless and noisy observations, attaining the optimal sample complexity in the former case and the minimax optimal statistical rate in the latter. We further illustrate the advantages of our generic framework through several specific examples, both theoretically and experimentally.
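
To make the idea of a projected step driven by a semi-stochastic gradient concrete, the following is a minimal sketch on a noiseless matrix sensing problem. It is not the algorithm analyzed in this paper. It is a generic SVRG-style loop under several assumptions made here for illustration: Gaussian sensing matrices, a squared loss, a spectral initial estimator, mini-batch sampling, and a truncated-SVD projection onto rank-r matrices. The names `project_rank` and `svrg_low_rank` and all parameter values are hypothetical.

```python
import numpy as np


def project_rank(X, r):
    """Best rank-r approximation of X in Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]


def svrg_low_rank(A, y, r, eta=0.05, epochs=30, inner=50, batch=20, seed=0):
    """SVRG-style projected descent for f(X) = 1/(2n) * sum_i (<A_i, X> - y_i)^2.

    A has shape (n, d1, d2); y has shape (n,). Step size, epoch count, and
    batch size are illustrative choices, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Assumed initial estimator: rank-r projection of (1/n) * sum_i y_i A_i.
    X = project_rank(np.tensordot(y, A, axes=1) / n, r)
    for _ in range(epochs):
        # Snapshot and its full gradient, computed once per outer iteration.
        X_snap = X.copy()
        resid_snap = np.tensordot(A, X_snap, axes=2) - y
        full_grad = np.tensordot(resid_snap, A, axes=1) / n
        for _ in range(inner):
            idx = rng.choice(n, size=batch, replace=False)
            resid = np.tensordot(A[idx], X, axes=2) - y[idx]
            # Semi-stochastic gradient: mini-batch gradient at X, minus the
            # same mini-batch gradient at the snapshot, plus the full gradient.
            v = np.tensordot(resid - resid_snap[idx], A[idx], axes=1) / batch
            v += full_grad
            # Gradient step followed by projection onto rank-r matrices.
            X = project_rank(X - eta * v, r)
    return X


# Synthetic noiseless example: recover a rank-2 matrix from 1000 measurements.
rng = np.random.default_rng(1)
d1, d2, r, n = 8, 8, 2, 1000
X_true = rng.standard_normal((d1, r)) @ rng.standard_normal((r, d2))
A = rng.standard_normal((n, d1, d2))
y = np.tensordot(A, X_true, axes=2)

X_hat = svrg_low_rank(A, y, r)
rel_err = np.linalg.norm(X_hat - X_true) / np.linalg.norm(X_true)
print(f"relative Frobenius error: {rel_err:.2e}")
```

The correction term built from the snapshot keeps the gradient estimate unbiased while shrinking its variance as the iterate approaches the snapshot, which is what allows a constant step size in this sketch. The full gradient is only recomputed once per outer iteration, so most steps touch a single mini-batch.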
