Personalized topic modeling for recommending user-generated content

User-generated content (UGC), such as blog posts and tweets, is growing rapidly in modern Internet services. In such systems, recommender systems are needed to help people filter the vast amount of UGC produced by other users. However, traditional recommendation models do not use information about which user authored which item. In this paper, we show that this additional authorship information can significantly improve recommendation performance. We propose a generative model that combines hierarchical topic modeling with matrix factorization. Empirical results show that our model outperforms other state-of-the-art models and provides interpretable topic structures for both users and items. Furthermore, because a user's interests can be inferred from the content that user has produced, recommendations can be made for users who have no ratings at all, which addresses the cold-start problem.
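
To make the cold-start idea concrete, the following is a minimal sketch and not the model proposed in the paper. It assumes, purely for illustration, that each item already has a topic-proportion vector, that a user's latent factor is the mean of the topic vectors of the items that user authored plus an optional learned offset, and that a preference score is the inner product of user and item vectors. All names (`item_topics`, `authored`, `user_factor`, `score`) and all numbers are hypothetical.

```python
from typing import Dict, List, Optional

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one authored item")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def user_factor(authored_items: List[str],
                item_topics: Dict[str, Vector],
                offset: Optional[Vector] = None) -> Vector:
    """Assumed user factor: mean topic vector of authored items plus an offset.

    For a cold-start user (no ratings) no offset has been learned, so the
    factor falls back to the authorship-based mean alone.
    """
    base = mean_vector([item_topics[i] for i in authored_items])
    if offset is None:
        return base
    return [b + o for b, o in zip(base, offset)]


def score(user_vec: Vector, item_vec: Vector) -> float:
    """Matrix-factorization style preference score (inner product)."""
    return sum(u * v for u, v in zip(user_vec, item_vec))


# Hypothetical topic proportions over three topics for four posts.
item_topics = {
    "post_a": [0.8, 0.1, 0.1],
    "post_b": [0.6, 0.3, 0.1],
    "post_c": [0.1, 0.1, 0.8],
    "post_d": [0.7, 0.2, 0.1],
}

# A new user who has written two posts but rated nothing.
authored = {"new_user": ["post_a", "post_b"]}

u = user_factor(authored["new_user"], item_topics)
candidates = ["post_c", "post_d"]
ranked = sorted(candidates, key=lambda i: score(u, item_topics[i]), reverse=True)
```

In this toy setting the new user's own posts concentrate on the first topic, so `post_d`, which shares that topic, ranks above `post_c` even though the user has supplied no ratings.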
