Collaborative Multi-Level Embedding Learning from Reviews for Rating Prediction

We investigate the problem of personalized review-based rating prediction, which aims to predict users' ratings for items they have not yet evaluated by using their historical reviews and ratings. Most existing methods solve this problem by integrating a topic model with a latent factor model to learn interpretable user and item factors. However, these methods cannot exploit the local word context in reviews. Moreover, they simply constrain user and item representations to be equivalent to their review representations, which may carry irrelevant information from the review text into those representations and harm the accuracy of rating prediction. In this paper, we propose a novel Collaborative Multi-Level Embedding (CMLE) model to address these limitations. The main technical contribution of CMLE is to integrate a word embedding model with a standard matrix factorization model through a projection level. This allows CMLE to inherit the word embedding model's ability to capture local word context, and to relax the strict equivalence requirement by projecting review embeddings onto user and item embeddings. A joint optimization problem is formulated and solved with an efficient stochastic gradient ascent algorithm. Empirical evaluations on real-world datasets show that CMLE outperforms several competitive methods and addresses both limitations.
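The projection idea described above can be illustrated with a minimal sketch. This is not the authors' formulation: the abstract does not specify how reviews are composed from words, how the projection is parameterized, or what the training objective is. The averaging step, the linear projection matrices, and every name below (`review_embedding`, `project`, `predict_rating`, `proj_user`, `proj_item`) are assumptions made only for illustration, and the parameters are random rather than learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
vocab_size, word_dim, latent_dim = 50, 8, 4

# Word embeddings; in a jointly trained model these would be learned, not random.
word_vecs = rng.normal(scale=0.1, size=(vocab_size, word_dim))

# Assumed linear projections from review space into user and item latent spaces.
proj_user = rng.normal(scale=0.1, size=(latent_dim, word_dim))
proj_item = rng.normal(scale=0.1, size=(latent_dim, word_dim))


def review_embedding(token_ids):
    """Average the word vectors of one review (an assumed, simple composition)."""
    return word_vecs[np.asarray(token_ids)].mean(axis=0)


def project(review_vecs, proj):
    """Average an entity's review embeddings, then project into latent space."""
    return proj @ np.mean(review_vecs, axis=0)


def predict_rating(user_vec, item_vec, global_mean):
    """Matrix-factorization style prediction: global mean plus inner product."""
    return global_mean + float(user_vec @ item_vec)


# Toy reviews given as lists of word ids (made-up data).
user_reviews = [[1, 5, 7], [2, 3]]
item_reviews = [[4, 5, 6, 9]]

p_u = project([review_embedding(r) for r in user_reviews], proj_user)
q_i = project([review_embedding(r) for r in item_reviews], proj_item)
rating = predict_rating(p_u, q_i, global_mean=3.5)
```

Because the user and item vectors are obtained through a projection rather than set equal to the review embedding, the latent space can differ in dimension from the word space and need not retain everything in the review text, which is the relaxation the abstract refers to.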
