Entity Linking and Retrieval has recently become one of the most interesting tasks in Information Extraction because of its many applications. Entity Linking (EL) is the task of detecting entity mentions in a text and linking them to the corresponding entries of a Knowledge Base. EL is traditionally composed of three major parts: i) spotting, ii) candidate generation, and iii) candidate disambiguation. The performance of an EL system depends heavily on the accuracy of each individual part. In this paper, we focus on these three building blocks and try to improve on the results of one open-source EL system, DBpedia Spotlight. We propose text pre-processing and parameter tuning to "focus" a general-purpose EL system so that it performs better on different kinds of input text. In addition, one of the main weaknesses of EL systems is identifying when a name does not refer to any known entity. To improve this so-called NIL detection, we define features using a set of texts and their known entities and design a classifier that automatically labels DBpedia Spotlight's output entities as "NIL" or "Not NIL". The proposed system participated in the SIGIR ERD Challenge 2014, and a performance analysis on the challenge's datasets shows that the proposed approaches improve the accuracy of the baseline system.
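To make the three-stage structure concrete, the following is a minimal sketch of spotting, candidate generation, and candidate disambiguation with a NIL decision. It is not DBpedia Spotlight's implementation or API: the toy knowledge base `TOY_KB`, the `Link` record, all function names, and the fixed `nil_threshold` are hypothetical and introduced only for illustration. In particular, the score threshold stands in for the trained NIL classifier described above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical toy knowledge base: surface form -> {entity id: prior score}.
TOY_KB: Dict[str, Dict[str, float]] = {
    "paris": {"Paris_(France)": 0.9, "Paris_Hilton": 0.1},
    "spotlight": {"DBpedia_Spotlight": 0.6, "Spotlight_(film)": 0.4},
}


@dataclass
class Link:
    mention: str
    entity: Optional[str]  # None means the mention was labelled NIL
    score: float


def spot(text: str) -> List[str]:
    """Spotting: keep tokens that are known surface forms or are capitalized."""
    tokens = [t.strip(".,;:!?") for t in text.split()]
    return [t for t in tokens if t and (t.lower() in TOY_KB or t[0].isupper())]


def generate_candidates(mention: str) -> Dict[str, float]:
    """Candidate generation: look up the entities a surface form may refer to."""
    return TOY_KB.get(mention.lower(), {})


def disambiguate(mention: str, candidates: Dict[str, float],
                 nil_threshold: float = 0.5) -> Link:
    """Disambiguation: pick the best-scoring candidate, or NIL if none is good enough."""
    if not candidates:
        return Link(mention, None, 0.0)
    entity, score = max(candidates.items(), key=lambda kv: kv[1])
    if score < nil_threshold:
        # Simplification: a fixed threshold replaces a learned NIL classifier.
        return Link(mention, None, score)
    return Link(mention, entity, score)


def link_entities(text: str, nil_threshold: float = 0.5) -> List[Link]:
    """Run the three stages in order over a piece of text."""
    return [disambiguate(m, generate_candidates(m), nil_threshold)
            for m in spot(text)]
```

For the input "Zorblax visited Paris.", the sketch spots "Zorblax" and "Paris", finds no candidates for the former and labels it NIL, and links the latter to its highest-scoring candidate. An error in any one stage propagates to the final output, which is why the accuracy of each part matters.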