Scalable Variational Bayesian Matrix Factorization with Side Information

Bayesian matrix factorization (BMF) is a popular method for collaborative prediction because it is robust to overfitting and does not require cross-validation to tune regularization parameters. In practice, however, existing variational inference algorithms for BMF have time complexity cubic in the rank of the factor matrices, so they are not well suited to web-scale datasets in which millions of users provide billions of ratings. The time complexity increases further when side information, such as binary implicit user feedback or item content information, is incorporated into variational Bayesian matrix factorization (VBMF). For instance, a state-of-the-art approach to VBMF with side information places Gaussian priors on the user and item factor matrices, where the mean of each prior is regressed on the corresponding side information. Since this approach introduces additional time complexity that is cubic in the size of the feature vectors, it rules out rich side information in the form of high-dimensional feature vectors. In this paper, we present a scalable inference algorithm for VBMF with side information whose complexity is linear in the rank K of the factor matrices. Moreover, the algorithm can be easily parallelized on multi-core systems. Experiments on large-scale datasets demonstrate the algorithm's scalability, fast learning, and prediction accuracy.

Appearing in Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS) 2014, Reykjavik, Iceland. JMLR: W&CP volume 33. Copyright 2014 by the authors.
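To make the complexity claims in the abstract concrete, the following is a minimal sketch of where the cubic cost comes from and how an element-wise posterior can avoid it. All notation here is an assumption introduced for illustration and is not taken from the paper: $r_{ij}$ is a rating observed for $(i,j) \in \Omega$, $u_i, v_j \in \mathbb{R}^K$ are user and item factors, $f_i \in \mathbb{R}^M$ is the side-information vector of user $i$, $A \in \mathbb{R}^{M \times K}$ is a regression matrix treated as fixed, $\tau$ is the noise precision, and $\Lambda_u$ is the prior precision. The fully factorized update shown last is one common way to obtain cost linear in $K$; the abstract does not say whether it is the scheme the authors use.

```latex
% Assumed model: Gaussian likelihood, prior mean regressed on side information
p(r_{ij} \mid u_i, v_j) = \mathcal{N}\!\left(r_{ij} \mid u_i^\top v_j,\ \tau^{-1}\right),
\qquad
p(u_i \mid A, f_i) = \mathcal{N}\!\left(u_i \mid A^\top f_i,\ \Lambda_u^{-1}\right)

% Variational update with a full K x K covariance per user:
% the matrix inverse costs O(K^3) for every user i
S_i = \Big(\Lambda_u + \tau \sum_{j \in \Omega_i} \mathbb{E}\!\left[v_j v_j^\top\right]\Big)^{-1},
\qquad
\bar{u}_i = S_i \Big(\Lambda_u A^\top f_i + \tau \sum_{j \in \Omega_i} r_{ij}\, \bar{v}_j\Big)

% Fully factorized alternative, q(u_i) = \prod_k q(u_{ik}), with
% \Lambda_u = \mathrm{diag}(\lambda_1, \dots, \lambda_K) and a_k the k-th column of A:
% only scalar divisions, no matrix inverse
s_{ik} = \Big(\lambda_k + \tau \sum_{j \in \Omega_i} \mathbb{E}\!\left[v_{jk}^2\right]\Big)^{-1},
\qquad
\bar{u}_{ik} = s_{ik} \Big(\lambda_k\, a_k^\top f_i
  + \tau \sum_{j \in \Omega_i} \big(r_{ij} - \sum_{k' \neq k} \bar{u}_{ik'} \bar{v}_{jk'}\big)\, \bar{v}_{jk}\Big)
```

Under these assumptions, if the residuals $r_{ij} - \bar{u}_i^\top \bar{v}_j$ are cached and adjusted after each coordinate update, one sweep over the $K$ coordinates of user $i$ costs $O(K\,|\Omega_i|)$, which is linear in $K$. Updates for different users are independent given the item factors, which is consistent with the abstract's remark about parallelization on multi-core systems.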
