Learning Topical Translation Model for Microblog Hashtag Suggestion

Hashtags can be viewed as an indication of the context of a tweet or as the core idea it expresses. They provide valuable information for many applications, such as information retrieval, opinion mining, and text classification. However, only a small number of microblogs are manually tagged. To address this problem, we propose a topical translation model for microblog hashtag suggestion. We assume that the content and the hashtags of a tweet describe the same themes but are written in different languages. Under this assumption, hashtag suggestion is modeled as a translation process from content to hashtags. Moreover, to account for the topics of tweets, the proposed model treats the translation probability as topic-specific. It uses topic-specific word triggers to bridge the vocabulary gap between the words in tweets and hashtags, and discovers the topics of tweets with a topic model designed for microblogs. Experimental results on a dataset crawled from a real-world microblogging service demonstrate that the proposed method outperforms state-of-the-art methods.
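To make the idea of a topic-specific translation probability concrete, the following is a minimal sketch of how a hashtag could be scored for a tweet by summing, over topics and tweet words, the product of a topic weight, a word weight, and a topic-conditioned translation probability. It is an illustration only and not necessarily the exact formulation of the proposed model: the function name `score_hashtags`, the uniform word weighting, and the toy probability tables are all assumptions made for this example. In practice the topic distribution and the translation table would be estimated from data.

```python
from collections import defaultdict


def score_hashtags(tweet_words, topic_probs, trans_probs):
    """Rank candidate hashtags for one tweet (illustrative sketch).

    tweet_words: list of tokens in the tweet.
    topic_probs: dict mapping topic -> P(topic | tweet), assumed to come
        from some topic model.
    trans_probs: dict mapping (topic, word) -> {hashtag: P(hashtag | word, topic)}.
    """
    if not tweet_words:
        return []
    # Assumption: every token in the tweet is weighted equally.
    word_weight = 1.0 / len(tweet_words)
    scores = defaultdict(float)
    for topic, p_topic in topic_probs.items():
        for word in tweet_words:
            # Words with no translation entry under this topic contribute nothing.
            for hashtag, p_trans in trans_probs.get((topic, word), {}).items():
                scores[hashtag] += p_topic * word_weight * p_trans
    # Highest-scoring hashtags first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy, hand-made probabilities (hypothetical, for illustration only).
topic_probs = {"sports": 0.8, "tech": 0.2}
trans_probs = {
    ("sports", "goal"): {"#worldcup": 0.7, "#soccer": 0.3},
    ("sports", "match"): {"#soccer": 0.6, "#worldcup": 0.4},
    ("tech", "goal"): {"#startup": 1.0},
}
ranked = score_hashtags(["goal", "match"], topic_probs, trans_probs)
for hashtag, score in ranked:
    print(f"{hashtag}\t{score:.2f}")
```

In this toy example the word "goal" translates to different hashtags depending on the topic, so the dominant topic of the tweet decides which hashtag ranks first. That is the intuition behind making the translation probability topic-specific.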
