Integrating Topic and Latent Factors for Scalable Personalized Review-based Rating Prediction

Personalized review-based rating prediction, a recently emerged research problem, aims to infer users' ratings of items they have not yet rated from existing reviews and their corresponding ratings. Although some researchers have proposed learning topic factors from review text to make rating prediction interpretable, they often overlook the fact that the learned topic factors are limited to the review text and cannot fully capture the complicated relations between reviews and ratings. Moreover, topic-modeling solutions to this problem usually rely on Gibbs sampling to learn topic and word distributions, which incurs non-negligible computational overhead. To address these challenges, we propose an integrated topic and latent factor model (ITLFM), which combines topic and latent factors linearly so that they complement each other and yield more accurate rating predictions. In addition, ITLFM models review text through an additive topic model to reveal users' and items' topic factors simultaneously. To ensure high learning efficiency, we design a hybrid stochastic learning algorithm for ITLFM. We evaluate ITLFM on several standard benchmarks and compare it with representative approaches. The experimental results demonstrate that the proposed ITLFM method is computationally efficient, accurate, and scalable to large-scale applications.
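To make the idea of a linear combination of topic and latent factors concrete, the following is a minimal sketch and not the ITLFM formulation itself, which the abstract does not specify. It assumes a prediction of the form global mean plus a weighted latent-factor inner product plus a weighted topic-factor inner product, trained with one plain stochastic gradient step on a regularized squared error. The names `predict`, `sgd_step`, `alpha`, `lr`, and `reg` are hypothetical, and the topic factors are treated as fixed inputs here rather than learned from review text.

```python
from typing import List, Tuple


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def predict(mu: float, p_u: List[float], q_i: List[float],
            theta_u: List[float], phi_i: List[float],
            alpha: float = 0.5) -> float:
    """Assumed linear combination: latent term weighted by alpha,
    topic term weighted by (1 - alpha), added to the global mean mu."""
    return mu + alpha * dot(p_u, q_i) + (1.0 - alpha) * dot(theta_u, phi_i)


def sgd_step(r: float, mu: float, p_u: List[float], q_i: List[float],
             theta_u: List[float], phi_i: List[float],
             alpha: float = 0.5, lr: float = 0.01, reg: float = 0.02
             ) -> Tuple[List[float], List[float], float]:
    """One stochastic gradient step on 0.5*err^2 + 0.5*reg*(|p|^2 + |q|^2).
    Only the latent factors are updated; topic factors stay fixed
    (a simplification made for this sketch)."""
    err = r - predict(mu, p_u, q_i, theta_u, phi_i, alpha)
    new_p = [p + lr * (alpha * err * q - reg * p) for p, q in zip(p_u, q_i)]
    new_q = [q + lr * (alpha * err * p - reg * q) for p, q in zip(p_u, q_i)]
    return new_p, new_q, err


if __name__ == "__main__":
    mu, p_u, q_i = 3.0, [1.0, 0.0], [2.0, 0.0]
    theta_u, phi_i = [0.5, 0.5], [1.0, 1.0]
    print(predict(mu, p_u, q_i, theta_u, phi_i))
    p_u, q_i, err = sgd_step(5.0, mu, p_u, q_i, theta_u, phi_i)
    print(err, predict(mu, p_u, q_i, theta_u, phi_i))
```

Processing one observed rating at a time, as `sgd_step` does, is what keeps the per-update cost independent of the dataset size; a full model in this spirit would also update the topic side from the review text.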
