Paper Information - NTUNLP Approaches to Recognizing and Disambiguating Entities in Long and Short Text in the 2014 ERD Challenge

This paper presents the NTUNLP systems for the long track and the short track of the Entity Recognition and Disambiguation (ERD) Challenge 2014. We first create a dictionary that contains the possible surface forms of Freebase IDs, then scan the given text from left to right with a longest-match strategy to detect mentions, and eliminate unwanted surface forms using a stop word list. Methods for linking to the most relevant entities and selecting the best candidate are proposed for each of the two tracks. Outside resources such as DBpedia Spotlight and TAGME are integrated into our basic NTUNLP systems. Various experimental setups are presented and discussed using the development set. In the formal run, one NTUNLP system wins first prize in the short track and another NTUNLP system takes fourth place in the long track.
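The mention-detection step described in the abstract (dictionary lookup, left-to-right longest match, then stop-word filtering) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `detect_mentions` and `filter_stop_words`, the whitespace tokenization, the `max_len` window, and the placeholder entity IDs are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Mention = Tuple[str, List[str]]  # (surface form, candidate entity IDs)


def detect_mentions(text: str,
                    surface_forms: Dict[str, List[str]],
                    max_len: int = 5) -> List[Mention]:
    """Scan tokens left to right, always taking the longest dictionary match."""
    tokens = text.lower().split()  # assumption: simple whitespace tokenization
    mentions: List[Mention] = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest window first, then shrink it one token at a time.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in surface_forms:
                mentions.append((candidate, surface_forms[candidate]))
                i += n  # continue scanning after the matched span
                matched = True
                break
        if not matched:
            i += 1
    return mentions


def filter_stop_words(mentions: List[Mention],
                      stop_words: Set[str]) -> List[Mention]:
    """Drop detected surface forms that appear in the stop word list."""
    return [m for m in mentions if m[0] not in stop_words]


# Hypothetical dictionary; the IDs are placeholders, not real Freebase IDs.
surface_forms = {
    "new york": ["/m/placeholder_a"],
    "new york times": ["/m/placeholder_b"],
    "the": ["/m/placeholder_c"],
}
stop_words = {"the"}

found = detect_mentions("The New York Times reported on New York", surface_forms)
kept = filter_stop_words(found, stop_words)
print([surface for surface, _ in kept])
```

In this sketch the longest-match rule makes "new york times" win over the shorter "new york" at the same position, while the stop word list removes the spurious match on "the". Selecting the best candidate among the returned IDs is a separate step and is not covered here.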
