Addressing cold-start: Scalable recommendation with tags and keywords

The cold-start problem for new users and new items is a major challenge for most collaborative filtering (CF) systems. Existing CF methods emphasize scaling to large, sparse datasets but lack a scalable approach to handling new data. In this paper, we present a novel method that alleviates the problem by incorporating content-based information about users and items, namely tags and keywords. User-item ratings imply the relevance of users' tags to items' keywords, so we convert direct prediction on the user-item rating matrix into indirect prediction on a tag-keyword relation matrix that adapts to the emergence of new data. We first propose a novel neighborhood approach for building the tag-keyword relation matrix from the statistics of tag-keyword pairs in the ratings. Then, given the relation matrix, we propose a 3-factor matrix factorization model over the rating matrix that learns each user's interest vector over selected tags and each item's correlation vector over extracted keywords. Finally, we integrate the relation matrix with these two kinds of vectors to make recommendations. Experiments on a real dataset demonstrate that our method not only outperforms other state-of-the-art CF algorithms on historical data but also scales well to new data.
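
The abstract does not give the model's formulas, so the following is only a minimal sketch of how the three steps could fit together, not the authors' implementation. It assumes that the tag-keyword relation is the mean rating over co-occurring tag-keyword pairs, that a rating is scored as the sum over tags t and keywords k of p[t] * M[t, k] * q[k], and that the vectors are learned by stochastic gradient descent on squared error with the relation matrix held fixed. All function names, hyperparameters, and the toy data are hypothetical.

```python
from collections import defaultdict


def build_relation(ratings, user_tags, item_keywords):
    """Assumed statistic: mean rating over every (tag, keyword) pair seen in a rating."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user, item, rating in ratings:
        for tag in user_tags[user]:
            for keyword in item_keywords[item]:
                totals[(tag, keyword)] += rating
                counts[(tag, keyword)] += 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


def uniform_weights(labels):
    """Equal weight per label; also usable for users or items with no ratings yet."""
    labels = list(labels)
    return {label: 1.0 / len(labels) for label in labels}


def predict(relation, tag_weights, keyword_weights):
    """Three-factor score: sum over tags t and keywords k of p[t] * M[t, k] * q[k]."""
    return sum(
        p * relation.get((tag, keyword), 0.0) * q
        for tag, p in tag_weights.items()
        for keyword, q in keyword_weights.items()
    )


def squared_error(ratings, relation, P, Q):
    """Total squared error of the three-factor score over the observed ratings."""
    return sum((r - predict(relation, P[u], Q[i])) ** 2 for u, i, r in ratings)


def fit(ratings, user_tags, item_keywords, relation, epochs=300, lr=0.005, reg=0.001):
    """Learn P and Q by SGD on squared error while the relation matrix stays fixed."""
    P = {u: uniform_weights(tags) for u, tags in user_tags.items()}
    Q = {i: uniform_weights(kws) for i, kws in item_keywords.items()}
    for _ in range(epochs):
        for user, item, rating in ratings:
            p, q = P[user], Q[item]
            err = rating - predict(relation, p, q)
            # Partial derivatives of the score, computed before either vector changes.
            grad_p = {t: sum(relation.get((t, k), 0.0) * q[k] for k in q) for t in p}
            grad_q = {k: sum(p[t] * relation.get((t, k), 0.0) for t in p) for k in q}
            for t in p:
                p[t] += lr * (err * grad_p[t] - reg * p[t])
            for k in q:
                q[k] += lr * (err * grad_q[k] - reg * q[k])
    return P, Q


# Hypothetical toy data: users carry tags, items carry keywords.
user_tags = {"alice": ["scifi", "action"], "bob": ["romance"], "carol": ["scifi", "romance"]}
item_keywords = {"m1": ["space", "robot"], "m2": ["love"]}
ratings = [
    ("alice", "m1", 5.0), ("alice", "m2", 1.0),
    ("bob", "m1", 2.0), ("bob", "m2", 4.0),
    ("carol", "m1", 4.0), ("carol", "m2", 3.0),
]

relation = build_relation(ratings, user_tags, item_keywords)
P, Q = fit(ratings, user_tags, item_keywords, relation)

# Cold start: a new user with tags but no ratings is scored without retraining.
new_user = uniform_weights(["action"])
cold_score = predict(relation, new_user, Q["m2"])
```

In this sketch, a new user or item needs only a weight vector over tags or keywords that already appear in the relation matrix, which is one way to read the abstract's claim that the indirect prediction adapts to new data.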
