Detecting Keyphrases in Micro-blogging with Graph Modeling of Information Diffusion

The rapidly increasing popularity of micro-blogging has made it an important information-seeking channel. Keyphrase extraction is an effective way to summarize and analyze micro-blogging content, and it can help users gain insight into internet hotspots. Existing keyphrase extraction methods usually consider only a single factor, such as phrase frequency or user retweet count. As a result, they may neglect both the relationships between phrases and the role that user influence plays in further information diffusion. Phrases that appear in influential users' micro-blogs are generally more likely to attract other users' interest, and therefore more likely to be diffused in the near future. In addition, phrases can be related to one another: some phrases follow similar diffusion paths and attract the attention of the same population. In this paper, we propose a novel model that detects micro-blogging keyphrases by considering all of these factors together. The model first detects high-frequency terms in a large collection of micro-blogs as candidate keyphrases, then constructs a relation graph over them using user interest and the user-following network. Finally, it ranks the candidates with graph models to detect keyphrases. Experiments show that the model is effective for micro-blogging keyphrase extraction.
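The three steps above (candidate detection, relation-graph construction, graph-based ranking) can be illustrated with a minimal sketch. The abstract does not specify the graph model, so everything concrete here is an assumption made for illustration: user influence is approximated by PageRank over the following network, two candidates are linked when the same users posted both (a stand-in for "attracting the same population"), and candidates are ranked by a PageRank whose teleport distribution favors phrases used by influential users. The names `pagerank`, `rank_keyphrases`, and `min_count` are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict
from itertools import combinations


def pagerank(weights, teleport=None, damping=0.85, iters=50):
    """Power-iteration PageRank over a weighted directed graph.

    weights: {src: {dst: weight}}; teleport: optional {node: prior mass}.
    """
    nodes = set(weights) | {d for out in weights.values() for d in out}
    if teleport is None:
        teleport = {n: 1.0 for n in nodes}
    total = sum(teleport.get(n, 0.0) for n in nodes)
    prior = {n: teleport.get(n, 0.0) / total for n in nodes}
    rank = dict(prior)
    for _ in range(iters):
        nxt = {n: (1.0 - damping) * prior[n] for n in nodes}
        for src in nodes:
            out = weights.get(src, {})
            out_sum = sum(out.values())
            if out_sum == 0:
                # Dangling node: redistribute its mass according to the prior.
                for n in nodes:
                    nxt[n] += damping * rank[src] * prior[n]
            else:
                for dst, w in out.items():
                    nxt[dst] += damping * rank[src] * w / out_sum
        rank = nxt
    return rank


def rank_keyphrases(posts, follows, min_count=2):
    """posts: list of (user, [terms]); follows: list of (follower, followee)."""
    # Step 1: terms that occur in at least min_count posts become candidates.
    freq = Counter(t for _, terms in posts for t in set(terms))
    candidates = {t for t, c in freq.items() if c >= min_count}

    # Assumed influence model: a follow edge passes rank from the
    # follower to the followee in the following network.
    follow_graph = defaultdict(dict)
    for follower, followee in follows:
        follow_graph[follower][followee] = 1.0
    for user, _ in posts:
        follow_graph.setdefault(user, {})
    influence = pagerank(follow_graph)

    # Step 2: link candidates that were posted by the same users
    # (assumed proxy for shared user interest).
    users_of = defaultdict(set)
    for user, terms in posts:
        for t in candidates.intersection(terms):
            users_of[t].add(user)
    relation = {t: {} for t in candidates}
    for a, b in combinations(sorted(candidates), 2):
        shared = len(users_of[a] & users_of[b])
        if shared:
            relation[a][b] = relation[b][a] = float(shared)

    # Step 3: rank candidates on the relation graph, biasing the walk
    # toward phrases used by influential users.
    prior = {t: sum(influence[u] for u in users_of[t]) for t in candidates}
    scores = pagerank(relation, teleport=prior)
    return sorted(scores, key=lambda t: (-scores[t], t))


if __name__ == "__main__":
    posts = [
        ("alice", ["earthquake", "rescue"]),
        ("bob", ["earthquake", "rescue"]),
        ("carol", ["earthquake", "concert"]),
        ("dave", ["concert", "tickets"]),
    ]
    follows = [("bob", "alice"), ("carol", "alice"), ("dave", "carol")]
    print(rank_keyphrases(posts, follows))
```

In this toy input, "tickets" appears in only one post and is filtered out before the graph is built. A real system would also need tokenization, phrase chunking, and stop-word handling, none of which are shown here.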
