Paper information - TAGME: on-the-fly annotation of short text fragments (by wikipedia entities) - 字舞流文

TAGME: on-the-fly annotation of short text fragments (by wikipedia entities)

We designed and implemented TAGME, a system that is able to efficiently and judiciously augment a plain text with pertinent hyperlinks to Wikipedia pages. The specialty of TAGME with respect to known systems [5,8] is that it may annotate texts which are short and poorly composed, such as snippets of search-engine results, tweets, news, etc. This annotation is extremely informative, so any task that is currently addressed using the bag-of-words paradigm could benefit from using this annotation to draw upon (the millions of) Wikipedia pages and their inter-relations.

Paolo Ferragina | Ugo Scaiella

[1] Nan Sun, et al. Exploiting internal and external semantics for the clustering of short texts using world knowledge, 2009, CIKM.

[2] Evgeniy Gabrilovich, et al. Feature Generation for Text Categorization Using World Knowledge, 2005, IJCAI.

[3] Mehran Sahami, et al. A web-based kernel function for measuring the similarity of short text snippets, 2006, WWW '06.

[4] Rada Mihalcea, et al. Wikify!: linking documents to encyclopedic knowledge, 2007, CIKM '07.

[5] Dawid Weiss, et al. A survey of Web clustering engines, 2009, CSUR.

[6] Paolo Ferragina, et al. A personalized search engine based on Web-snippet hierarchical clustering, 2005, WWW '05.

[7] Silviu Cucerzan, et al. Large-Scale Named Entity Disambiguation Based on Wikipedia Data, 2007, EMNLP.

[8] Evgeniy Gabrilovich, et al. Wikipedia-based Semantic Interpretation for Natural Language Processing, 2014, J. Artif. Intell. Res.

[9] Ian H. Witten, et al. An effective, low-cost measure of semantic relatedness obtained from Wikipedia links, 2008.

[10] Ganesh Ramakrishnan, et al. Collective annotation of Wikipedia entities in web text, 2009, KDD.

[11] Chaomei Chen, et al. Mining the Web: Discovering knowledge from hypertext data, 2004, J. Assoc. Inf. Sci. Technol.

[12] Somnath Banerjee, et al. Clustering short texts using wikipedia, 2007, SIGIR.

[13] Stanislaw Osinski. Improving Quality of Search Results Clustering with Approximate Matrix Factorisations, 2006, ECIR.

[14] Gerhard Weikum, et al. YAGO: A Core of Semantic Knowledge, 2007, WWW '07.

[15] Ian H. Witten, et al. Mining Meaning from Wikipedia, 2008, Int. J. Hum. Comput. Stud.

[16] Hua Li, et al. Enhancing text clustering by leveraging Wikipedia semantics, 2008, SIGIR '08.

[17] Ramanathan V. Guha, et al. TAP: A Semantic Web Test-bed, 2003, J. Web Semant.

[18] Yin Yang, et al. Query by document, 2009, WSDM '09.

[19] Lyle H. Ungar, et al. Web-scale named entity recognition, 2008, CIKM '08.

[20] Giuseppe Attardi, et al. Semantically Annotated Snapshot of the English Wikipedia, 2008, LREC.

[21] Ian H. Witten, et al. Learning to link with wikipedia, 2008, CIKM '08.

[22] Daniel S. Weld, et al. Autonomously semantifying wikipedia, 2007, CIKM '07.