Context-Aware Entity Disambiguation in Text Using Markov Chains

In recent years, the number of entities in large knowledge bases has grown rapidly. These entities can bridge unstructured text and structured knowledge, which benefits many entity-centric applications. The key task is to link entity mentions in text to entities in knowledge bases, and the main challenge is mention ambiguity. Many methods have been proposed to tackle this problem, but most of them assume certain characteristics of the input mentions and documents, e.g., that only named entities are considered. In this paper, we propose a context-aware approach to collective entity disambiguation that handles input mentions with different characteristics in a consistent manner. We extensively evaluate our approach on 9 datasets and compare it with 14 state-of-the-art methods. Experimental results show that our approach outperforms the existing methods in most cases.
