A Simple Word Trigger Method for Social Tag Suggestion

In the Web 2.0 era, it is popular for users to freely annotate online resources with tags. To ease the annotation process, there has been great interest in automatic tag suggestion. We propose a method that suggests tags according to the text description of a resource. By regarding the description and the tags of a given resource as summaries of that resource written in two different languages, we adopt word alignment models from statistical machine translation to bridge their vocabulary gap. Based on the translation probabilities between description words and tags, estimated on a large set of description-tags pairs, we build a word trigger method (WTM) that suggests tags according to the words in a resource description. Experiments on real-world datasets show that WTM is effective and robust compared with other methods. Moreover, WTM is relatively simple and efficient, which makes it practical for Web applications.
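
The idea of words "triggering" tags through translation probabilities can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: IBM Model 1 style EM (without a NULL word) stands in for the word alignment model, a word's weight in a description is its normalized term frequency, and a tag's score is the weighted sum of Pr(tag | word) over the description words. The function names `train_ibm1` and `suggest_tags` and the toy data are hypothetical, not the authors' implementation.

```python
from collections import Counter, defaultdict


def train_ibm1(pairs, iterations=10):
    """Estimate Pr(tag | word) with IBM Model 1 style EM (assumed model).

    pairs: list of (description_words, tags) tuples.
    Returns a dict mapping (word, tag) to a probability.
    """
    tag_vocab = {t for _, tags in pairs for t in tags}
    # Start from a uniform distribution over the tag vocabulary.
    prob = defaultdict(lambda: 1.0 / len(tag_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected (word, tag) alignments
        total = defaultdict(float)  # expected alignments per word
        for words, tags in pairs:
            for t in tags:
                # E-step: share each tag among the words that may trigger it.
                norm = sum(prob[(w, t)] for w in words)
                for w in words:
                    frac = prob[(w, t)] / norm
                    count[(w, t)] += frac
                    total[w] += frac
        # M-step: renormalize so that, for each word, the tags sum to 1.
        prob = defaultdict(
            float, {(w, t): c / total[w] for (w, t), c in count.items()}
        )
    return prob


def suggest_tags(prob, words, top_k=3):
    """Rank tags by sum over words of Pr(tag | word) * Pr(word | description)."""
    tf = Counter(words)
    n = sum(tf.values())
    scores = defaultdict(float)
    for (w, t), p in prob.items():
        if w in tf:
            # Assumption: word importance is the normalized term frequency.
            scores[t] += p * tf[w] / n
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy description-tags pairs, made up for illustration only.
pairs = [
    (["python", "code", "tutorial"], ["programming"]),
    (["python", "code", "snippet"], ["programming"]),
    (["pasta", "recipe", "dinner"], ["cooking"]),
    (["soup", "recipe", "dinner"], ["cooking"]),
    (["python", "recipe", "scraper"], ["programming", "cooking"]),
]
prob = train_ibm1(pairs)
print(suggest_tags(prob, ["pasta", "dinner"]))
```

At suggestion time this sketch only needs a table lookup and a weighted sum per description word, which is in line with the abstract's point about simplicity and efficiency; the word weighting actually used may differ from plain term frequency.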
