A hierarchical model for ordinal matrix factorization

This paper proposes a hierarchical probabilistic model for ordinal matrix factorization. Unlike previous approaches, we model the ordinal nature of the data and take a principled approach to incorporating priors for the hidden variables. Two inference algorithms are presented, one based on Gibbs sampling and one based on variational Bayes. Importantly, both algorithms can be applied to the factorization of very large matrices with missing entries. The model is evaluated on a collaborative filtering task in which users have rated a collection of movies and the system is asked to predict their ratings for other movies. The evaluation uses the Netflix data set, which consists of around 100 million ratings. With root mean-squared error (RMSE) as the evaluation metric, the results show that the proposed model outperforms alternative factorization techniques. The results also show that Gibbs sampling outperforms variational Bayes on this task, despite the large number of ratings and model parameters. Matlab implementations of the proposed algorithms are available from cogsys.imm.dtu.dk/ordinalmatrixfactorization.
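As an illustration of the evaluation metric named above, the following minimal sketch computes RMSE over observed ratings only, skipping missing entries. It is not taken from the authors' Matlab code: the function name `rmse_observed`, the use of `None` to mark a missing rating, and the toy matrices are all assumptions made for this example.

```python
import math


def rmse_observed(predicted, actual):
    """Root mean-squared error over observed entries only.

    `actual` holds ratings, with None marking a missing entry (an assumed
    convention for this sketch); `predicted` holds a prediction per entry.
    """
    squared_error = 0.0
    count = 0
    for pred_row, act_row in zip(predicted, actual):
        for p, a in zip(pred_row, act_row):
            if a is None:
                continue  # missing rating: contributes nothing to the error
            squared_error += (p - a) ** 2
            count += 1
    if count == 0:
        raise ValueError("no observed ratings to evaluate")
    return math.sqrt(squared_error / count)


# Toy 2 x 3 user-by-movie matrix on a 1-5 scale (hypothetical data).
actual = [[5, None, 3], [None, 1, 4]]
predicted = [[4.0, 2.5, 3.0], [3.5, 2.0, 4.0]]
score = rmse_observed(predicted, actual)
```

Here four ratings are observed, two of them predicted with an error of 1, so `score` equals the square root of 0.5. Restricting the sum to observed entries matters for data such as Netflix, where most user-movie pairs have no rating.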
