Troubleshooting and Optimizing Named Entity Resolution Systems in the Industry

Named Entity Resolution (NER) is an information extraction task that involves detecting mentions of named entities in text and mapping them to their corresponding entities in a given knowledge resource. Systems and frameworks for performing NER have been developed by both academia and industry, with different features and capabilities. Nevertheless, what all approaches have in common is that satisfactory performance in one scenario is not a trustworthy predictor of performance in a different one, because the scenarios differ in their characteristics (target entities, input texts, domain knowledge, etc.). With that in mind, in this paper we describe a metric-based Diagnostic Framework that can be used to identify the causes behind the low performance of NER systems in industrial settings and to take appropriate actions to increase it.
