Tag-Based User Interest Discovery Through Keywords Extraction in Social Network

We consider the problem of discovering user interests in social networks by exploiting user tags. User tags in social networks carry abundant clues about user interests, which greatly benefit tasks ranging from user profile construction to recommendation based on user similarity. However, extracting user interests from social tags suffers from the wide diversity of word choices that results from differing user preferences, especially for words that are specific to niche knowledge domains. In addition, the absence of a uniform concept hierarchy and the lack of explicit semantic associations between tags obscure users' real interests. To obtain user interests from tags, we propose a tag normalization algorithm based on world knowledge, which underpins both the construction of common tags and the organization of a hierarchy of user interests. Experiments on Sina Micro-blog (http://weibo.com/) show that our algorithm infers users' interests better than a traditional content-based method.
