Matrix completion with nonconvex regularization: spectral operators and scalable algorithms

In this paper, we study the matrix completion problem, where the task is to “fill in” the unobserved entries of a matrix from a small subset of observed entries, under the assumption that the underlying matrix is of low rank. Our contributions extend our prior work on nuclear norm regularized problems for matrix completion (Mazumder et al. in J Mach Learn Res 1532(11):2287–2322, 2010) by incorporating a continuum of nonconvex penalty functions between the convex nuclear norm and the nonconvex rank function. Inspired by Soft-Impute (Mazumder et al. 2010; Hastie et al. in J Mach Learn Res, 2016), we propose NC-Impute, an EM-flavored algorithmic framework for computing solutions to a family of nonconvex penalized matrix completion problems with warm starts. We present a systematic study of the associated spectral thresholding operators, which play an important role in the overall algorithm, and we study the convergence properties of the algorithm. Using structured low-rank SVD computations, we demonstrate the computational scalability of our proposal for problems up to the Netflix size (approximately a 500,000 × 20,000 matrix with $10^8$ observed entries). We demonstrate that, on a wide range of synthetic and real data instances, our proposed nonconvex regularization framework leads to low-rank solutions with better predictive performance than those obtained from nuclear norm problems. Implementations of the algorithms proposed herein, written in the R language, are made available on GitHub.
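To make the role of the spectral thresholding operators concrete, the following minimal sketch applies a scalar thresholding rule to a list of singular values. It assumes, as one illustrative member of such a penalty family, the minimax concave (MC+) penalty of Zhang (2010), which the abstract does not name. The function names `mcp_threshold` and `threshold_spectrum` and the parameters `lam` and `gamma` are hypothetical and are not taken from the authors' R implementation. In a full imputation scheme the singular values would come from an SVD of the current filled-in matrix, a step omitted here.

```python
def mcp_threshold(sigma, lam, gamma):
    """Scalar MC+ thresholding of a nonnegative singular value.

    Assumption: gamma > 1, so the underlying one-dimensional
    penalized least squares problem has a unique minimizer.
    """
    if gamma <= 1:
        raise ValueError("gamma must exceed 1")
    if sigma <= lam:
        # Small singular values are set exactly to zero.
        return 0.0
    if sigma <= lam * gamma:
        # Intermediate values are shrunk, but less than by soft thresholding.
        return (sigma - lam) / (1.0 - 1.0 / gamma)
    # Large singular values are left untouched.
    return float(sigma)


def threshold_spectrum(singular_values, lam, gamma):
    """Apply the scalar rule to each singular value (hypothetical helper)."""
    return [mcp_threshold(s, lam, gamma) for s in singular_values]


# Illustrative inputs, not taken from the paper.
singular_values = [5.0, 2.0, 1.2, 0.4]
lam = 1.0

near_soft = threshold_spectrum(singular_values, lam, gamma=1e6)
mid = threshold_spectrum(singular_values, lam, gamma=3.0)
near_hard = threshold_spectrum(singular_values, lam, gamma=1.0 + 1e-6)

print(near_soft)
print(mid)
print(near_hard)
```

In this sketch a very large `gamma` approximately reproduces soft thresholding (the nuclear norm case), while `gamma` close to 1 approaches hard thresholding (the rank case). Intermediate values shrink the surviving singular values less than soft thresholding does, while still zeroing those at or below `lam`.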

[1] Adrian S. Lewis, et al. Convex Analysis And Nonlinear Optimization, 2000.

[2] Tong Zhang, et al. A General Theory of Concave Regularization for High-Dimensional Sparse Estimation Problems, 2011, 1108.4988.

[3] Rahul Mazumder, et al. The Discrete Dantzig Selector: Estimating Sparse Linear Models via Mixed Integer Linear Optimization, 2015, IEEE Transactions on Information Theory.

[4] Emmanuel J. Candès, et al. A Singular Value Thresholding Algorithm for Matrix Completion, 2008, SIAM J. Optim.

[5] Yudong Chen, et al. Incoherence-Optimal Matrix Completion, 2013, IEEE Transactions on Information Theory.

[6] Cun-Hui Zhang. Nearly unbiased variable selection under minimax concave penalty, 2010, 1002.4734.

[7] Emmanuel J. Candès, et al. Exact Matrix Completion via Convex Optimization, 2008, Found. Comput. Math.

[8] Arian Maleki, et al. Overcoming The Limitations of Phase Transition by Higher Order Analysis of Regularization Techniques, 2016, The Annals of Statistics.

[9] Tengyu Ma, et al. Matrix Completion has No Spurious Local Minimum, 2016, NIPS.

[10] Ji Chen, et al. Nonconvex Rectangular Matrix Completion via Gradient Descent Without ℓ₂,∞ Regularization, 2020, IEEE Transactions on Information Theory.

[11] Benjamin Recht, et al. A Simpler Approach to Matrix Completion, 2009, J. Mach. Learn. Res.

[12] A. Tsybakov, et al. Estimation of high-dimensional low-rank matrices, 2009, 0912.5338.

[13] Jinchi Lv, et al. A unified approach to model selection and sparse recovery using regularized least squares, 2009, 0905.3573.

[14] D. Madigan, et al. [Least Angle Regression]: Discussion, 2004.

[15] Po-Ling Loh, et al. Regularized M-estimators with nonconvexity: statistical and algorithmic theory for local optima, 2013, J. Mach. Learn. Res.

[16] R. Mazumder, et al. Computing the degrees of freedom of rank-regularized estimators and cousins, 2019, Electronic Journal of Statistics.

[17] S. Mendelson, et al. Regularization and the small-ball method I: sparse recovery, 2016, 1601.05584.

[18] Martin J. Wainwright, et al. Restricted strong convexity and weighted matrix completion: Optimal bounds with noise, 2010, J. Mach. Learn. Res.

[19] Moritz Hardt, et al. Understanding Alternating Minimization for Matrix Completion, 2013, 2014 IEEE 55th Annual Symposium on Foundations of Computer Science.

[20] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.

[21] Zhi-Quan Luo, et al. Guaranteed Matrix Completion via Non-Convex Factorization, 2014, IEEE Transactions on Information Theory.

[22] Pierre Alquier, et al. A Bayesian approach for noisy matrix completion: Optimal rate under general sampling distribution, 2014, 1408.5820.

[23] H. Zou. The Adaptive Lasso and Its Oracle Properties, 2006.

[24] Yuxin Chen, et al. Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview, 2018, IEEE Transactions on Signal Processing.

[25] John D. Lafferty, et al. Convergence Analysis for Rectangular Matrix Completion Using Burer-Monteiro Factorization and Gradient Descent, 2016, ArXiv.

[26] C. Stein. Estimation of the Mean of a Multivariate Normal Distribution, 1981.

[27] Emmanuel J. Candès, et al. The Power of Convex Relaxation: Near-Optimal Matrix Completion, 2009, IEEE Transactions on Information Theory.

[28] Russ B. Altman, et al. Missing value estimation methods for DNA microarrays, 2001, Bioinform.

[29] Paul Grigas, et al. An Extended Frank-Wolfe Method with "In-Face" Directions, and Its Application to Low-Rank Matrix Completion, 2015, SIAM J. Optim.

[30] O. Klopp. Noisy low-rank matrix completion with general sampling distribution, 2012, 1203.0108.

[31] Emmanuel J. Candès, et al. Exact Matrix Completion via Convex Optimization, 2009, Found. Comput. Math.

[32] R. Tibshirani, et al. Least angle regression, 2004, math/0406456.

[33] Robert Tibshirani, et al. Spectral Regularization Algorithms for Learning Large Incomplete Matrices, 2010, J. Mach. Learn. Res.

[34] Yi Zheng, et al. No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis, 2017, ICML.

[35] Inderjit S. Dhillon, et al. Guaranteed Rank Minimization via Singular Value Projection, 2009, NIPS.

[36] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.

[37] Prateek Jain, et al. Low-rank matrix completion using alternating minimization, 2012, STOC '13.

[38] Hussein Hazimeh, et al. Fast Best Subset Selection: Coordinate Descent and Local Combinatorial Optimization Algorithms, 2018, Oper. Res.

[39] Emmanuel J. Candès, et al. Unbiased Risk Estimates for Singular Value Thresholding and Spectral Estimators, 2012, IEEE Transactions on Signal Processing.

[40] Mila Nikolova, et al. Local Strong Homogeneity of a Regularized Estimator, 2000, SIAM J. Appl. Math.

[41] V. N. Bogaevski, et al. Matrix Perturbation Theory, 1991.

[42] M. Fazel, et al. Reweighted nuclear norm minimization with application to system identification, 2010, Proceedings of the 2010 American Control Conference.

[43] Cun-Hui Zhang, et al. Sorted concave penalized regression, 2017, The Annals of Statistics.

[44] David Gross, et al. Recovering Low-Rank Matrices From Few Coefficients in Any Basis, 2009, IEEE Transactions on Information Theory.

[45] Maryam Fazel, et al. Iterative reweighted algorithms for matrix rank minimization, 2012, J. Mach. Learn. Res.

[46] A. Lewis. The Convex Analysis of Unitarily Invariant Matrix Functions, 1995.

[47] 丸山 徹. On two or three developments in convex analysis (in Japanese), 1977.

[48] H. Wold. Soft Modelling by Latent Variables: The Non-Linear Iterative Partial Least Squares (NIPALS) Approach, 1975, Journal of Applied Probability.

[49] Arian Maleki, et al. Does $\ell_p$-Minimization Outperform $\ell_1$-Minimization?, 2015, IEEE Transactions on Information Theory.

[50] Lei Zhang, et al. Weighted Nuclear Norm Minimization and Its Applications to Low Level Vision, 2016, International Journal of Computer Vision.

[51] Martin J. Wainwright, et al. Estimation of (near) low-rank matrices with noise and high-dimensional scaling, 2009, ICML.

[52] Yudong Chen, et al. Coherent Matrix Completion, 2013, ICML.

[53] Martin Jaggi, et al. A Simple Algorithm for Nuclear Norm Regularized Problems, 2010, ICML.

[54] Emmanuel J. Candès, et al. Matrix Completion With Noise, 2009, Proceedings of the IEEE.

[55] D. Bertsimas, et al. Best Subset Selection via a Modern Optimization Lens, 2015, 1507.03133.

[56] Mary Wootters, et al. Fast matrix completion without the condition number, 2014, COLT.

[57] Arian Maleki, et al. Which bridge estimator is optimal for variable selection?, 2017, ArXiv.

[58] I. Daubechies, et al. Iteratively reweighted least squares minimization for sparse recovery, 2008, 0807.0575.

[59] Robert H. Halstead, et al. Matrix Computations, 2011, Encyclopedia of Parallel Computing.

[60] Trevor J. Hastie, et al. Matrix completion and low-rank SVD via fast alternating least squares, 2014, J. Mach. Learn. Res.

[61] Nathan Srebro, et al. Global Optimality of Local Search for Low Rank Matrix Recovery, 2016, NIPS.

[62] Alexander Shapiro, et al. Matrix Completion With Deterministic Pattern: A Geometric Perspective, 2018, IEEE Transactions on Signal Processing.

[63] V. Koltchinskii, et al. Nuclear norm penalization and optimal rates for noisy low rank matrix completion, 2010, 1011.6256.

[64] T. Hastie, et al. SparseNet: Coordinate Descent With Nonconvex Penalties, 2011, Journal of the American Statistical Association.

[65] Jianqing Fan, et al. Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties, 2001.

[66] P. Radchenko, et al. Subset Selection with Shrinkage: Sparse Linear Modeling When the SNR Is Low, 2017, Oper. Res.

[67] Pablo A. Parrilo, et al. Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization, 2007, SIAM Rev.

[68] Stephen P. Boyd, et al. Enhancing Sparsity by Reweighted ℓ₁ Minimization, 2007, 0711.1612.

[69] Charles R. Johnson, et al. Matrix analysis, 1985, Statistical Inference for Engineers and Data Scientists.

[70] Dima Grigoriev, et al. Complexity of Quantifier Elimination in the Theory of Algebraically Closed Fields, 1984, MFCS.

[71] A. Maleki, et al. Which bridge estimator is the best for variable selection?, 2020.

[72] Yuxin Chen, et al. Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval, Matrix Completion, and Blind Deconvolution, 2017, Found. Comput. Math.

[73] J. Friedman, et al. A Statistical View of Some Chemometrics Regression Tools, 1993.

[74] J. W. Silverstein, et al. Spectral Analysis of Large Dimensional Random Matrices, 2009.

[75] Andrea Montanari, et al. Matrix Completion from Noisy Entries, 2009, J. Mach. Learn. Res.

[76] Massimo Fornasier, et al. Low-rank Matrix Recovery via Iteratively Reweighted Least Squares Minimization, 2010, SIAM J. Optim.