Probabilistic Matrix Factorization with Non-random Missing Data

We propose a probabilistic matrix factorization model for collaborative filtering that learns from data that is missing not at random (MNAR). Matrix factorization models exhibit state-of-the-art predictive performance in collaborative filtering. However, these models usually assume that the data is missing at random (MAR), and this is rarely the case. For example, the data is not MAR if users rate items they like more than ones they dislike. When the MAR assumption is incorrect, inferences are biased and predictive performance can suffer. Therefore, we model both the generative process for the data and the missing data mechanism. By learning these two models jointly we obtain improved performance over state-of-the-art methods when predicting the ratings and when modeling the data observation process. We present the first viable MF model for MNAR data. Our results are promising and we expect that further research on NMAR models will yield large gains in collaborative filtering.

[1]  Roderick J. A. Little,et al.  Statistical Analysis with Missing Data , 1988 .

[2]  David J. C. MacKay,et al.  The Evidence Framework Applied to Classification Networks , 1992, Neural Computation.

[3]  Michael I. Jordan,et al.  A Variational Approach to Bayesian Logistic Regression Models and their Extensions , 1997, AISTATS.

[4]  Brendan J. Frey,et al.  Factor graphs and the sum-product algorithm , 2001, IEEE Trans. Inf. Theory.

[5]  Tom Minka,et al.  Expectation Propagation for approximate Bayesian inference , 2001, UAI.

[6]  Wei Chu,et al.  Gaussian Processes for Ordinal Regression , 2005, J. Mach. Learn. Res..

[7]  H. Robbins A Stochastic Approximation Method , 1951 .

[8]  Yew Jin Lim Variational Bayesian Approach to Movie Rating Prediction , 2007 .

[9]  Ruslan Salakhutdinov,et al.  Probabilistic Matrix Factorization , 2007, NIPS.

[10]  Richard S. Zemel,et al.  Collaborative Filtering and the Missing at Random Assumption , 2007, UAI.

[11]  Guy Shani,et al.  A Survey of Accuracy Evaluation Metrics of Recommendation Tasks , 2009, J. Mach. Learn. Res..

[12]  Thore Graepel,et al.  Matchbox: large scale online bayesian recommendations , 2009, WWW '09.

[13]  Lars Schmidt-Thieme,et al.  BPR: Bayesian Personalized Ranking from Implicit Feedback , 2009, UAI.

[14]  Richard S. Zemel,et al.  Collaborative prediction and ranking with non-random missing data , 2009, RecSys '09.

[15]  Harald Steck,et al.  Training and testing of recommender systems on data missing not at random , 2010, KDD.

[16]  Guillaume Bouchard,et al.  Robust Bayesian Matrix Factorisation , 2011, AISTATS.

[17]  Ole Winther,et al.  A hierarchical model for ordinal matrix factorization , 2012, Stat. Comput..

[18]  Chong Wang,et al.  Stochastic variational inference , 2012, J. Mach. Learn. Res..

[19]  Ulrich Paquet,et al.  One-class collaborative filtering with random graphs , 2013, WWW '13.