CUR Algorithm for Partially Observed Matrices

CUR matrix decomposition computes the low rank approximation of a given matrix by using the actual rows and columns of the matrix. It has been a very useful tool for handling large matrices. One limitation with the existing algorithms for CUR matrix decomposition is that they cannot deal with entries in a partially observed matrix, while incomplete matrices are found in many real world applications. In this work, we alleviate this limitation by developing a CUR decomposition algorithm for partially observed matrices. In particular, the proposed algorithm computes the low rank approximation of the target matrix based on (i) the randomly sampled rows and columns, and (ii) a subset of observed entries that are randomly sampled from the matrix. Our analysis shows the error bound, measured by spectral norm, for the proposed algorithm when the target matrix is of full rank. We also show that only O(nr ln r) observed entries are needed by the proposed algorithm to perfectly recover a rank r matrix of size n × n, which improves the sample complexity of the existing algorithms for matrix completion. Empirical studies on both synthetic and real-world datasets verify our theoretical claims and demonstrate the effectiveness of the proposed algorithm.

[1]  Christos Boutsidis,et al.  Optimal CUR matrix decompositions , 2014, SIAM J. Comput..

[2]  David P. Woodruff Sketching as a Tool for Numerical Linear Algebra , 2014, Found. Trends Theor. Comput. Sci..

[3]  Prateek Jain,et al.  Universal Matrix Completion , 2014, ICML.

[4]  Yudong Chen,et al.  Coherent Matrix Completion , 2013, ICML.

[5]  Ieee Staff 2013 IEEE 54th Annual Symposium on Foundations of Computer Science (FOCS) , 2013 .

[6]  Miao Xu,et al.  Speedup Matrix Completion with Side Information: Application to Multi-Label Learning , 2013, NIPS.

[7]  Akshay Krishnamurthy,et al.  Low-Rank Matrix and Tensor Completion via Adaptive Sampling , 2013, NIPS.

[8]  Zhihua Zhang,et al.  Improving CUR matrix decomposition and the Nyström approximation via adaptive sampling , 2013, J. Mach. Learn. Res..

[9]  Prateek Jain,et al.  Low-rank matrix completion using alternating minimization , 2012, STOC '13.

[10]  Rong Jin,et al.  Improved Bounds for the Nyström Method With Application to Kernel Classification , 2011, IEEE Transactions on Information Theory.

[11]  Shusen Wang,et al.  A Scalable CUR Matrix Decomposition Algorithm: Lower Time Complexity and Tighter Bound , 2012, NIPS.

[12]  David P. Woodruff,et al.  Fast approximation of matrix coherence and statistical leverage , 2011, ICML.

[13]  Robert D. Nowak,et al.  High-Rank Matrix Completion and Subspace Clustering with Missing Data , 2011, ArXiv.

[14]  V. Koltchinskii,et al.  Oracle inequalities in empirical risk minimization and sparse recovery problems , 2011 .

[15]  Ameet Talwalkar,et al.  Divide-and-Conquer Matrix Factorization , 2011, NIPS.

[16]  Christos Boutsidis,et al.  Near Optimal Column-Based Matrix Reconstruction , 2011, 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science.

[17]  Joel A. Tropp,et al.  Improved Analysis of the subsampled Randomized Hadamard Transform , 2010, Adv. Data Sci. Adapt. Anal..

[18]  David Gross,et al.  Recovering Low-Rank Matrices From Few Coefficients in Any Basis , 2009, IEEE Transactions on Information Theory.

[19]  Benjamin Recht,et al.  A Simpler Approach to Matrix Completion , 2009, J. Mach. Learn. Res..

[20]  Nathan Halko,et al.  Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions , 2009, SIAM Rev..

[21]  V. Koltchinskii Low Rank Matrix Recovery: Nuclear Norm Penalization , 2011 .

[22]  Robert H. Halstead,et al.  Matrix Computations , 2011, Encyclopedia of Parallel Computing.

[23]  Robert D. Nowak,et al.  Transduction with Matrix Completion: Three Birds with One Stone , 2010, NIPS.

[24]  Michael W. Mahoney,et al.  CUR from a Sparse Optimization Viewpoint , 2010, NIPS.

[25]  Luis Rademacher,et al.  Efficient Volume Sampling for Row/Column Subset Selection , 2010, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science.

[26]  Robert Tibshirani,et al.  Spectral Regularization Algorithms for Learning Large Incomplete Matrices , 2010, J. Mach. Learn. Res..

[27]  A. Tsybakov,et al.  Estimation of high-dimensional low-rank matrices , 2009, 0912.5338.

[28]  Martin J. Wainwright,et al.  Estimation of (near) low-rank matrices with noise and high-dimensional scaling , 2009, ICML.

[29]  Emmanuel J. Candès,et al.  Matrix Completion With Noise , 2009, Proceedings of the IEEE.

[30]  Emmanuel J. Candès,et al.  The Power of Convex Relaxation: Near-Optimal Matrix Completion , 2009, IEEE Transactions on Information Theory.

[31]  Andrea Montanari,et al.  Matrix completion from a few entries , 2009, 2009 IEEE International Symposium on Information Theory.

[32]  Emmanuel J. Candès,et al.  A Singular Value Thresholding Algorithm for Matrix Completion , 2008, SIAM J. Optim..

[33]  Jieping Ye,et al.  An accelerated gradient method for trace norm minimization , 2009, ICML '09.

[34]  Petros Drineas,et al.  CUR matrix decompositions for improved data analysis , 2009, Proceedings of the National Academy of Sciences.

[35]  Emmanuel J. Candès,et al.  Exact Matrix Completion via Convex Optimization , 2008, Found. Comput. Math..

[36]  S. Yun,et al.  An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems , 2009 .

[37]  Francis R. Bach,et al.  Consistency of trace norm minimization , 2007, J. Mach. Learn. Res..

[38]  S. Muthukrishnan,et al.  Relative-Error CUR Matrix Decompositions , 2007, SIAM J. Matrix Anal. Appl..

[39]  Petros Drineas,et al.  Tensor-CUR decompositions for tensor-based data , 2006, KDD '06.

[40]  Petros Drineas,et al.  FAST MONTE CARLO ALGORITHMS FOR MATRICES III: COMPUTING A COMPRESSED APPROXIMATE MATRIX DECOMPOSITION∗ , 2004 .

[41]  Yurii Nesterov,et al.  Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.

[42]  Tommi S. Jaakkola,et al.  Maximum-Margin Matrix Factorization , 2004, NIPS.

[43]  Eugene E. Tyrtyshnikov,et al.  Incomplete Cross Approximation in the Mosaic-Skeleton Method , 2000, Computing.

[44]  Christopher K. I. Williams,et al.  Using the Nyström Method to Speed Up Kernel Machines , 2000, NIPS.

[45]  G. W. Stewart,et al.  Four algorithms for the the efficient computation of truncated pivoted QR approximations to a sparse matrix , 1999, Numerische Mathematik.

[46]  Ren-Cang Li,et al.  Relative Perturbation Theory: II. Eigenspace and Singular Subspace Variations , 1996, SIAM J. Matrix Anal. Appl..

[47]  S. Goreinov,et al.  Pseudo-skeleton approximations by matrices of maximal volume , 1997 .

[48]  S. Goreinov,et al.  A Theory of Pseudoskeleton Approximations , 1997 .

[49]  Gene H. Golub,et al.  Matrix computations (3rd ed.) , 1996 .

[50]  Severnyi Kavkaz Pseudo-Skeleton Approximations by Matrices of Maximal Volume , 2022 .