Adapting Semantic Spreading Activation to Entity Linking in Text

Extracting and disambiguating knowledge from textual resources on the web is crucial to advancing the Web of Linked Data. The goal of our work is to semantically enrich raw data by linking mentions of named entities in text to the corresponding entities in knowledge bases. Our approach considers multiple aspects: prior knowledge about an entity in Wikipedia (i.e., the keyphraseness and commonness features, which can be precomputed by crawling a Wikipedia dump), a set of features extracted from the input text and from the knowledge base, and the correlation or relevancy among resources in Linked Data. More precisely, this work explores a collective ranking approach formalized as a weighted graph model, in which the mentions in the input text and the candidate entities from knowledge bases are linked using local compatibility and global relatedness measures. Experiments on the datasets of the Open Knowledge Extraction (OKE) challenge, run with different configurations of our approach in each phase of the linking pipeline, reveal its best-performing configuration. We investigate a notion of semantic relatedness between two entities, each represented as a set of neighbours in Linked Open Data, that relies on an associative retrieval algorithm and takes their common neighbourhood into account. This measure improves the performance of prior link-based models and outperforms the explicit inter-link relevancy measure among entities (mostly Wikipedia-centric). Thus, our approach is resilient to non-existent or sparse links among related entities.
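
The ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the usual definitions from the wikification literature: commonness is the fraction of a mention's anchor occurrences that point to a given entity, and keyphraseness is the fraction of documents containing a surface form in which it appears as a link. The relatedness function is a simplified stand-in for a spreading-activation measure over shared neighbours, not the measure evaluated in the paper. All names (`anchor_targets`, `spread`, `relatedness`, the `decay` and `steps` parameters) and all counts and graph edges are hypothetical toy values.

```python
from collections import Counter, defaultdict

# Hypothetical anchor statistics (toy numbers, not real Wikipedia counts).
# anchor_targets[mention][entity] = times `mention` is the anchor of a link to `entity`.
anchor_targets = {
    "paris": Counter({"Paris": 90, "Paris_Hilton": 8, "Paris,_Texas": 2}),
}
linked_docs = {"paris": 50}    # documents where the surface form is a link
total_docs = {"paris": 200}    # documents where the surface form appears at all


def commonness(mention, entity):
    """Share of the mention's anchor occurrences that point to `entity`."""
    counts = anchor_targets.get(mention, Counter())
    total = sum(counts.values())
    return counts[entity] / total if total else 0.0


def keyphraseness(mention):
    """Share of documents containing the mention in which it is a link."""
    total = total_docs.get(mention, 0)
    return linked_docs.get(mention, 0) / total if total else 0.0


def spread(graph, seed, steps=2, decay=0.5):
    """Propagate activation from `seed` for a fixed number of steps.

    At each step a node passes `decay` times its incoming activation,
    split evenly among its neighbours. Returns node -> accumulated activation.
    """
    activation = {seed: 1.0}
    frontier = {seed: 1.0}
    for _ in range(steps):
        incoming = defaultdict(float)
        for node, value in frontier.items():
            neighbours = graph.get(node, ())
            if not neighbours:
                continue
            share = decay * value / len(neighbours)
            for neighbour in neighbours:
                incoming[neighbour] += share
        for node, value in incoming.items():
            activation[node] = activation.get(node, 0.0) + value
        frontier = incoming
    return activation


def relatedness(graph, a, b, steps=2, decay=0.5):
    """Overlap of the two activation patterns (sum of per-node minima)."""
    act_a = spread(graph, a, steps, decay)
    act_b = spread(graph, b, steps, decay)
    shared = set(act_a) & set(act_b)
    return sum(min(act_a[n], act_b[n]) for n in shared)


# Toy undirected neighbourhood graph: A and B have no direct link but share C.
toy_graph = {
    "A": ["C", "X"], "B": ["C", "Y"], "D": ["Z"],
    "C": ["A", "B"], "X": ["A"], "Y": ["B"], "Z": ["D"],
}
```

In this toy graph, `relatedness(toy_graph, "A", "B")` is positive even though A and B are not directly linked, because activation from both reaches the shared neighbour C, whereas `relatedness(toy_graph, "A", "D")` is zero. This is the kind of behaviour the abstract points to when it describes resilience to non-existent or sparse links among related entities.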
