Balancing Prediction and Recommendation Accuracy: Hierarchical Latent Factors for Preference Data

Recent work in Recommender Systems (RS) has investigated the relationship between prediction accuracy, i.e. the ability of an RS to minimize a cost function (for instance the RMSE measure) when estimating users’ preferences, and the accuracy of the recommendation list provided to users. State-of-the-art recommendation algorithms that focus on minimizing RMSE have been shown to achieve weak results from the recommendation accuracy perspective, and vice versa. In this work we present a novel Bayesian probabilistic hierarchical approach for users’ preference data, which is designed to overcome the limitations of current methodologies and thus to achieve both prediction and recommendation accuracy. According to the generative semantics of this technique, each user is modeled as a random mixture over latent factors, which identify the interests of user communities. Each individual user community is then modeled as a mixture of topics, which capture the preferences of its members on a set of items. We provide two different formalizations of the basic hierarchical model: BH-Forced focuses on rating prediction, while BH-Free models both the popularity of items and the distribution over item ratings. The combined modeling of item popularity and rating provides a powerful framework for generating highly accurate recommendations. An extensive evaluation over two popular benchmark datasets reveals the effectiveness and quality of the proposed algorithms, showing that BH-Free realizes the most satisfactory compromise between prediction and recommendation accuracy with respect to several state-of-the-art competitors.
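The generative semantics described above (user, then community, then topic, then item and rating) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact conditional dependencies or priors, so the symmetric Dirichlet priors, the per-topic item distribution, the per-topic-and-item rating distribution, and all function and parameter names below are assumptions chosen for illustration, loosely in the spirit of BH-Free.

```python
import random


def dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def categorical(probs, rng):
    """Sample an index according to the given probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_ratings(n_users=5, n_items=8, n_communities=3, n_topics=4,
                     n_ratings=5, obs_per_user=6, seed=0):
    """Sample (user, item, rating) triples from a hypothetical hierarchy."""
    rng = random.Random(seed)
    # Assumption: each community is a mixture over topics.
    topic_mix = [dirichlet([1.0] * n_topics, rng)
                 for _ in range(n_communities)]
    # Assumption: each topic has its own item-popularity distribution.
    item_dist = [dirichlet([1.0] * n_items, rng) for _ in range(n_topics)]
    # Assumption: ratings depend on the (topic, item) pair.
    rating_dist = [[dirichlet([1.0] * n_ratings, rng)
                    for _ in range(n_items)] for _ in range(n_topics)]

    data = []
    for user in range(n_users):
        # Each user is a random mixture over communities (latent factors).
        community_mix = dirichlet([1.0] * n_communities, rng)
        for _ in range(obs_per_user):
            community = categorical(community_mix, rng)
            topic = categorical(topic_mix[community], rng)
            item = categorical(item_dist[topic], rng)
            rating = categorical(rating_dist[topic][item], rng) + 1
            data.append((user, item, rating))
    return data


if __name__ == "__main__":
    for triple in generate_ratings()[:5]:
        print(triple)
```

In this sketch the item is drawn before the rating, so item popularity and rating are modeled jointly. A variant closer in spirit to BH-Forced would presumably take the observed item as given and sample only the rating, but that reading is also an assumption.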
