Bayesian binomial mixture model for collaborative prediction with non-random missing data

Collaborative prediction involves filling in missing entries of a user-item matrix to predict preferences of users based on their observed preferences. Most of existing models assume that the data is missing at random (MAR), which is often violated in recommender systems in practice. Incorrect assumption on missing data ignores the missing data mechanism, leading to biased inferences and prediction. In this paper we present a Bayesian binomial mixture model for collaborative prediction, where the generative process for data and missing data mechanism are jointly modeled to handle non-random missing data. Missing data mechanism is modeled by three factors, each of which is related to users, items, and rating values. Each factor is modeled by Bernoulli random variable, and the observation of rating value is determined by the Boolean OR operation of three binary variables. We develop computationally-efficient variational inference algorithms, where variational parameters have closed-form update rules and the computational complexity depends on the number of observed ratings, instead of the size of the rating data matrix. We also discuss implementation issues on hyperparameter tuning and estimation based on empirical Bayes. Experiments on Yahoo! Music and MovieLens datasets confirm the useful behavior of our model by demonstrating that: (1) it outperforms state-of-the-art methods in yielding higher predictive performance; (2) it finds meaningful solutions instead of undesirable boundary solutions; (3) it provides rating trend analysis on why ratings are observed.

[1]  Seungjin Choi,et al.  Scalable Variational Bayesian Matrix Factorization with Side Information , 2014, AISTATS.

[2]  Jinlong Wu Binomial Matrix Factorization for Discrete Collaborative Filtering , 2009, 2009 Ninth IEEE International Conference on Data Mining.

[3]  Richard S. Zemel,et al.  Collaborative prediction and ranking with non-random missing data , 2009, RecSys '09.

[4]  Ruslan Salakhutdinov,et al.  Collaborative Filtering in a Non-Uniform World: Learning with the Weighted Trace Norm , 2010, NIPS.

[5]  Seungjin Choi,et al.  Variational Bayesian View of Weighted Trace Norm Regularization for Matrix Factorization , 2013, IEEE Signal Processing Letters.

[6]  Harald Steck,et al.  Training and testing of recommender systems on data missing not at random , 2010, KDD.

[7]  Geoffrey E. Hinton,et al.  Restricted Boltzmann machines for collaborative filtering , 2007, ICML '07.

[8]  Emmanuel J. Candès,et al.  The Power of Convex Relaxation: Near-Optimal Matrix Completion , 2009, IEEE Transactions on Information Theory.

[9]  Sunho Park,et al.  Hierarchical Bayesian Matrix Factorization with Side Information , 2013, IJCAI.

[10]  Ruslan Salakhutdinov,et al.  Bayesian probabilistic matrix factorization using Markov chain Monte Carlo , 2008, ICML '08.

[11]  Yehuda Koren,et al.  The Yahoo! Music Dataset and KDD-Cup '11 , 2012, KDD Cup.

[12]  Seungjin Choi,et al.  Bayesian Matrix Co-Factorization: Variational Algorithm and Cramér-Rao Bound , 2011, ECML/PKDD.

[13]  Michael R. Lyu,et al.  Response Aware Model-Based Collaborative Filtering , 2012, UAI.

[14]  T. Minka Estimating a Dirichlet distribution , 2012 .

[15]  Benjamin M. Marlin,et al.  Missing Data Problems in Machine Learning , 2008 .

[16]  Juha Karhunen,et al.  Principal Component Analysis for Large Scale Problems with Lots of Missing Values , 2007, ECML.

[17]  Yew Jin Lim Variational Bayesian Approach to Movie Rating Prediction , 2007 .

[18]  Ulrich Paquet,et al.  One-class collaborative filtering with random graphs , 2013, WWW.

[19]  Nicole A. Lazar,et al.  Statistical Analysis With Missing Data , 2003, Technometrics.

[20]  Richard S. Zemel,et al.  Collaborative Filtering and the Missing at Random Assumption , 2007, UAI.

[21]  Benjamin M. Marlin,et al.  Modeling User Rating Profiles For Collaborative Filtering , 2003, NIPS.