Matrix Completion With Covariate Information

ABSTRACT This article investigates the problem of matrix completion from corrupted data when additional covariates are available. Although seldom considered in the matrix completion literature, such covariates often provide valuable information for completing the unobserved entries of the high-dimensional target matrix A0. Given a covariate matrix X whose rows represent the row covariates of A0, we consider a column-space-decomposition model A0 = Xβ0 + B0, where β0 is a coefficient matrix and B0 is a low-rank matrix whose column space is orthogonal to that of X. This model provides a clear separation between the interpretable covariate effects (Xβ0) and the flexible hidden factor effects (B0). In addition, our approach allows the probabilities of observation to depend on the covariate matrix, and hence permits a missing-at-random mechanism. We propose a novel penalized estimator for A0 that uses both Frobenius-norm and nuclear-norm regularization and is computed by an efficient and scalable algorithm. Asymptotic convergence rates of the proposed estimators are studied. The empirical performance of the proposed methodology is illustrated via both numerical experiments and a real data application.
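
To make the column-space decomposition concrete, the following is a minimal NumPy sketch, not the authors' algorithm. It builds a synthetic A0 = Xβ0 + B0 with B0 orthogonal to the columns of X, recovers the two parts by projection, and then forms a simple plug-in estimate from partially observed noisy data. The dimensions, the noise level, the constant observation probability `theta` (assumed known here, whereas the abstract allows it to depend on X), and the tuning values `lam` and `tau` are all illustrative assumptions, as are the helper names `decompose` and `soft_threshold_svd`.

```python
import numpy as np

def decompose(A, X):
    """Split A into its projection onto col(X) and the orthogonal remainder."""
    Q, _ = np.linalg.qr(X)          # orthonormal basis of col(X)
    PA = Q @ (Q.T @ A)              # covariate part, equals X @ beta when A = X beta + B
    return PA, A - PA               # (X beta, B)

def soft_threshold_svd(M, tau):
    """Singular value soft-thresholding: the proximal map of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# Synthetic instance of the model (all sizes are assumptions for illustration).
rng = np.random.default_rng(0)
n1, n2, m, r = 60, 40, 3, 2
X = rng.normal(size=(n1, m))
beta0 = rng.normal(size=(m, n2))
L = rng.normal(size=(n1, r)) @ rng.normal(size=(r, n2))
_, B0 = decompose(L, X)             # low-rank part made orthogonal to col(X)
A0 = X @ beta0 + B0

# With full, noiseless data the two components are recovered by projection.
XB, B = decompose(A0, X)

# Partially observed, noisy data with a known constant observation probability.
theta = 0.5                         # assumed known; hypothetical simplification
mask = rng.random((n1, n2)) < theta
Y = mask * (A0 + 0.1 * rng.normal(size=(n1, n2)))
Yw = Y / theta                      # inverse-probability-weighted observations

# Ridge (Frobenius-norm penalized) fit for the covariate part.
lam, tau = 1.0, 5.0                 # illustrative tuning values
beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ Yw)

# Nuclear-norm penalized fit for the part orthogonal to col(X).
_, resid = decompose(Yw, X)
B_hat = soft_threshold_svd(resid, tau)

A_hat = X @ beta_hat + B_hat
print("relative error:", np.linalg.norm(A_hat - A0) / np.linalg.norm(A0))
```

Because the second component is fitted only on the part of the weighted data orthogonal to col(X), the estimate `B_hat` stays orthogonal to X by construction, mirroring the separation between covariate effects and hidden factor effects described above. In practice `lam` and `tau` would need to be tuned, for example by cross-validation.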
