Parallel matrix factorization for binary response

Predicting user affinity to items is an important problem in applications such as content optimization and computational advertising. While matrix factorization methods provide state-of-the-art performance when minimizing RMSE through a Gaussian response model on explicit ratings data, applying them to imbalanced binary response data presents additional challenges that we study carefully in this paper. Data in many applications consist of users' implicit binary responses (clicking an item or not); the goal is to predict click rates (i.e., probabilities), which are often combined with other utility measures to rank items at runtime. Because such data are implicit, they are usually much larger than explicit rating data, but they often have an imbalanced distribution with only a small fraction of click events, which makes accurate click rate prediction difficult. In this paper, we address two problems. First, we show that previous techniques for estimating factor models with binary data are less accurate than our new approach based on adaptive rejection sampling, especially for imbalanced response. Second, we develop a parallel matrix factorization framework using Map-Reduce that scales to massive data sets. Our parallel algorithm is based on a “divide and conquer” strategy coupled with an ensemble approach. Through experiments on two benchmark data sets and a large Yahoo! Front Page Today Module data set that contains 8M users and 1B binary observations, we show that careful handling of binary response is needed to achieve good performance for click rate prediction, and that the proposed adaptive rejection sampler and the partitioning and ensemble techniques significantly improve performance.
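
The “divide and conquer” plus ensemble idea can be illustrated with a minimal Python sketch. It is not the paper's algorithm: each partition is fitted here with plain stochastic gradient descent on a logistic factor model, not with the adaptive rejection sampler, the partitions are processed in a sequential loop where a Map-Reduce job would run them in parallel, and the ensemble simply averages predicted click probabilities. The random partitioning rule, all function names (`fit_partition`, `fit_ensemble`, `ensemble_predict`), and all hyperparameters are assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def fit_partition(obs, n_users, n_items, rank=2, epochs=30,
                  lr=0.05, reg=0.05, seed=0):
    """Fit p(click) = sigmoid(bias + u_i . v_j) on one partition.

    Assumption: SGD on the regularized log-likelihood stands in for
    the sampling-based estimation described in the abstract.
    """
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    bias = 0.0
    obs = list(obs)
    for _ in range(epochs):
        rng.shuffle(obs)
        for i, j, y in obs:
            p = sigmoid(bias + sum(a * b for a, b in zip(U[i], V[j])))
            g = y - p  # gradient of the log-likelihood w.r.t. the logit
            bias += lr * g
            for k in range(rank):
                ui, vj = U[i][k], V[j][k]
                U[i][k] += lr * (g * vj - reg * ui)
                V[j][k] += lr * (g * ui - reg * vj)
    return bias, U, V


def predict(model, i, j):
    """Predicted click probability of user i on item j under one model."""
    bias, U, V = model
    return sigmoid(bias + sum(a * b for a, b in zip(U[i], V[j])))


def fit_ensemble(obs, n_users, n_items, n_parts=4, seed=0):
    """Divide: randomly split (user, item, click) triples into partitions.
    Conquer: fit one model per partition (the "map" step, sequential here).
    """
    rng = random.Random(seed)
    parts = [[] for _ in range(n_parts)]
    for o in obs:
        parts[rng.randrange(n_parts)].append(o)
    return [fit_partition(p, n_users, n_items, seed=seed + k + 1)
            for k, p in enumerate(parts)]


def ensemble_predict(models, i, j):
    """Ensemble: average the per-partition click probabilities."""
    return sum(predict(m, i, j) for m in models) / len(models)
```

Calling `fit_ensemble` on a list of `(user, item, click)` triples returns one model per partition, and `ensemble_predict(models, i, j)` gives the averaged click probability for user `i` and item `j`. Because each model is fitted independently, the per-partition step is the part that could be distributed across workers.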
