Combining Rating and Review Data by Initializing Latent Factor Models with Topic Models for Top-N Recommendation

Multiple sources of data are now commonly associated with items. Users may provide numerical ratings or implicit interactions, and may also write textual reviews. Although many algorithms have been proposed to jointly learn a model over both interactions and textual data, there is room to improve the many factorization models that are proven to work well on interaction data but are not designed to exploit textual information. In this work we propose a simple, easily applicable, and effective method to incorporate review data into such factorization models. In particular, we propose to build the user and item embeddings within the topic space of a topic model learned from the review data. This has several advantages. We observe that initializing the user and item embeddings in topic space leads to faster convergence of the factorization algorithm, and to a model that outperforms models initialized randomly or with other state-of-the-art initialization strategies. Moreover, constraining user and item factors to topic space allows for the learning of an interpretable model that users can visualise.
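
The idea of initializing the factors in topic space can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes NMF with multiplicative updates as the topic model, one bag-of-words "document" per user and per item built from their reviews, and a masked non-negative factorization of the rating matrix. All names (`nmf_topics`, `factorize`, `top_n`) and the toy data are hypothetical.

```python
import numpy as np

EPS = 1e-9  # avoids division by zero in the multiplicative updates

def nmf_topics(docs, k, iters=200, seed=0):
    """Assumed topic model: NMF, docs ~ T @ H, with T the doc-topic weights."""
    rng = np.random.default_rng(seed)
    T = rng.random((docs.shape[0], k)) + EPS
    H = rng.random((k, docs.shape[1])) + EPS
    for _ in range(iters):
        H *= (T.T @ docs) / (T.T @ T @ H + EPS)
        T *= (docs @ H.T) / (T @ H @ H.T + EPS)
    return T, H

def masked_error(R, mask, P, Q):
    """Squared error over observed ratings only."""
    return float(np.sum(mask * (R - P @ Q.T) ** 2))

def factorize(R, mask, P0, Q0, iters=100):
    """Masked non-negative factorization R ~ P @ Q.T, started from P0, Q0."""
    P, Q = P0.copy(), Q0.copy()
    Rm = R * mask
    for _ in range(iters):
        P *= (Rm @ Q) / (((P @ Q.T) * mask) @ Q + EPS)
        Q *= (Rm.T @ P) / (((P @ Q.T) * mask).T @ P + EPS)
    return P, Q

def top_n(P, Q, mask, user, n):
    """Rank items the user has not interacted with by predicted score."""
    scores = P[user] @ Q.T
    scores[mask[user] > 0] = -np.inf  # exclude already-seen items
    return np.argsort(-scores)[:n]

# Toy term counts: 3 user documents and 4 item documents over 5 terms.
user_docs = np.array([[3, 0, 1, 0, 2], [0, 2, 0, 3, 0], [1, 0, 2, 0, 1]], float)
item_docs = np.array([[2, 0, 1, 0, 1], [0, 3, 0, 2, 0],
                      [1, 0, 2, 0, 2], [0, 1, 0, 2, 0]], float)
# Toy ratings; 0 marks an unobserved entry.
R = np.array([[5, 0, 4, 0], [0, 4, 0, 5], [4, 0, 0, 1]], float)
mask = (R > 0).astype(float)

# Learn topics over all documents, then split into user and item embeddings.
T, _ = nmf_topics(np.vstack([user_docs, item_docs]), k=2)
P0, Q0 = T[:3], T[3:]

P, Q = factorize(R, mask, P0, Q0)
recommended = top_n(P, Q, mask, user=0, n=2)
```

Because the multiplicative updates keep every factor non-negative, each dimension of `P` and `Q` can still be read against the topic that seeded it, which is one simple way to retain the interpretability described above. Whether this matches the constraint used in the paper is an assumption.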
