Matrix Completion under Gaussian Models Using MAP and EM Algorithms

Completing a partially observed matrix (matrix completion) is an important problem in data mining and signal processing, with successful applications to sensor localization and recommendation systems. Low-rank and factorization models are the two most popular and successful classes of models used for matrix completion. In this paper, we investigate another approach, based on statistical estimation, that has previously been used for matrix completion. In earlier work involving Gaussian models (GM), the formulation was inaccurate, necessitating an ad-hoc empirical diagonal loading of a covariance matrix; this required additional tuning and made the final estimate of the model parameters difficult to interpret. An accurate formulation, with a correct likelihood-based objective function, already exists in the statistical literature; we use it here to learn the model parameters with an Expectation-Maximization (EM) algorithm. The resulting approach requires no tuning and performs better in numerical experiments. The difference in behavior stems from the choice of objective function: the original method leads to an underestimated covariance matrix, necessitating artificial diagonal loading, whereas the method we use provides a Maximum Likelihood (ML) estimate of the model parameters. We also validate our approach on real-world data from MovieLens, EachMovie, and Netflix.
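The procedure the abstract describes, ML estimation of a Gaussian model's mean and covariance from a partially observed matrix via EM, followed by conditional-mean imputation of the missing entries, can be sketched as follows. This is a generic sketch of EM for a multivariate Gaussian with missing data, not the paper's exact implementation; the function and variable names are illustrative, each row is assumed to have at least one observed entry, and the small `eps` jitter is purely for numerical stability (not the ad-hoc diagonal loading the abstract criticizes).

```python
import numpy as np

def em_gaussian_completion(X, mask, n_iters=50, eps=1e-6):
    """EM estimation of (mu, Sigma) for rows of X drawn from a Gaussian,
    where mask[i, j] is True iff entry (i, j) is observed, followed by
    conditional-mean imputation of the missing entries.

    Illustrative sketch only; assumes every row has >= 1 observed entry
    and every column has >= 1 observed entry (for the initialization)."""
    n, d = X.shape
    # Initialize with observed-data column means and identity covariance.
    mu = np.array([X[mask[:, j], j].mean() for j in range(d)])
    Sigma = np.eye(d)
    for _ in range(n_iters):
        S = np.zeros((d, d))          # accumulator for E[x x^T]
        mu_new = np.zeros(d)
        for i in range(n):
            o, m = mask[i], ~mask[i]  # observed / missing index sets
            x_hat = X[i].copy()
            C = np.zeros((d, d))      # conditional covariance of missing block
            if m.any():
                Soo = Sigma[np.ix_(o, o)] + eps * np.eye(int(o.sum()))
                Smo = Sigma[np.ix_(m, o)]
                # E-step: conditional mean of missing entries given observed...
                x_hat[m] = mu[m] + Smo @ np.linalg.solve(Soo, X[i, o] - mu[o])
                # ...and conditional covariance — the term a naive plug-in
                # formulation drops, which underestimates Sigma.
                C[np.ix_(m, m)] = Sigma[np.ix_(m, m)] - Smo @ np.linalg.solve(Soo, Smo.T)
            mu_new += x_hat
            S += np.outer(x_hat, x_hat) + C
        mu = mu_new / n
        # M-step: ML update of the covariance.
        Sigma = S / n - np.outer(mu, mu)
    # Completion: conditional-mean estimate of each missing entry.
    X_full = X.copy()
    for i in range(n):
        o, m = mask[i], ~mask[i]
        if m.any():
            Soo = Sigma[np.ix_(o, o)] + eps * np.eye(int(o.sum()))
            X_full[i, m] = mu[m] + Sigma[np.ix_(m, o)] @ np.linalg.solve(Soo, X[i, o] - mu[o])
    return X_full, mu, Sigma
```

The key line is the conditional-covariance term `C`: including it is what makes the M-step a proper ML update, whereas imputing only conditional means and re-fitting the covariance (the plug-in approach) systematically shrinks it, which is consistent with the abstract's observation that the earlier formulation required artificial diagonal loading.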
