An empirical study on user-topic rating based collaborative filtering methods

User-based collaborative filtering (CF) has been applied successfully in recommender systems for years. Its main idea is to discover communities of users who share similar interests, so the measurement of user similarity is the foundation of the method. However, existing user-based CF methods suffer from data sparsity: the user-item matrix is often too sparse to yield good recommendations. One way to alleviate this problem is to bring new data sources into user-based CF. Thanks to the rapid development of social annotation systems, tags are a natural new source. User-topic rating based CF extracts topics from tags using a topic model and then computes the similarity between users by measuring their preferences on those topics. In this paper, we compare three user-topic rating based CF methods, built on PLSA, Hierarchical Clustering, and LDA respectively. All three methods calculate user-topic preferences from users' ratings of items and topic weights. We conduct the experiments on the MovieLens dataset. The experimental results show that the LDA based and Hierarchical Clustering based user-topic rating CF methods outperform traditional user-based CF in recommendation accuracy, while the PLSA based method performs worse than traditional user-based CF.
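
The abstract states only that user-topic preferences are calculated from item ratings and topic weights; it does not give the formula. The following is a minimal sketch under stated assumptions, not the paper's implementation: it assumes a preference is the topic-weighted average of a user's ratings and that users are compared by cosine similarity. The item-topic weight matrix is taken as given (in the paper it would come from PLSA, Hierarchical Clustering, or LDA over tags), and the function names `user_topic_preferences` and `user_similarity` are hypothetical.

```python
import numpy as np


def user_topic_preferences(ratings, item_topic):
    """Assumed form: topic-weighted average of each user's ratings.

    ratings:    (n_users, n_items) array, 0 meaning "not rated".
    item_topic: (n_items, n_topics) array of non-negative topic weights.
    Returns an (n_users, n_topics) array of preferences.
    """
    ratings = np.asarray(ratings, dtype=float)
    item_topic = np.asarray(item_topic, dtype=float)
    rated = (ratings > 0).astype(float)
    weighted = ratings @ item_topic   # sum over items of rating * topic weight
    mass = rated @ item_topic         # total topic weight over the rated items
    # Leave the preference at 0 where a user has rated nothing on a topic.
    return np.divide(weighted, mass, out=np.zeros_like(weighted), where=mass > 0)


def user_similarity(prefs):
    """Cosine similarity between users' topic-preference vectors (an assumption)."""
    norms = np.linalg.norm(prefs, axis=1, keepdims=True)
    unit = np.divide(prefs, norms, out=np.zeros_like(prefs), where=norms > 0)
    return unit @ unit.T


# Toy data: two users with no co-rated item, three items, two topics.
ratings = np.array([[5, 0, 0],
                    [0, 0, 4]])
item_topic = np.array([[1.0, 0.0],
                       [0.0, 1.0],
                       [0.5, 0.5]])
prefs = user_topic_preferences(ratings, item_topic)
sim = user_similarity(prefs)
```

In this toy example the two users share no rated item, so an item-overlap similarity would have nothing to compare, yet their topic-level similarity is nonzero because the item the second user rated partly belongs to the first user's topic. This is the sense in which topic-level preferences can ease sparsity in the user-item matrix.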
