An Approach to Collective Entity Linking

Entity linking is the task of disambiguating entity mentions in unstructured text by linking each mention to an entity in a catalog. Several collective entity linking approaches attempt to disambiguate all mentions in a text jointly by leveraging both local mention-entity context and global entity-entity relatedness. However, the complexity of these models makes it infeasible to employ exact inference techniques or to jointly train the local and global feature weights. In this work we present a collective disambiguation model that, under suitable assumptions, admits an efficient implementation of exact MAP inference. We also present an efficient approach to training the local and global feature weights of this model, and we implement it in an interactive entity linking system. The system receives human feedback on a document collection and progressively trains the underlying disambiguation model.
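
To make the idea of a collective objective concrete, the following is a minimal sketch, not the model or inference procedure of this work. It assumes a joint score that adds weighted local mention-entity scores to weighted pairwise entity-entity relatedness, and it finds the exact MAP assignment by brute-force enumeration, which is exponential in the number of mentions and only workable for tiny inputs. The function name `map_assignment`, the weights `w_local` and `w_global`, and all scores and entity names are hypothetical illustrations.

```python
from itertools import product


def map_assignment(candidates, local_score, relatedness, w_local=1.0, w_global=1.0):
    """Return the highest-scoring joint assignment of entities to mentions.

    candidates:  dict mapping each mention to its list of candidate entities
    local_score: dict mapping (mention, entity) to a local compatibility score
    relatedness: dict mapping frozenset({entity_a, entity_b}) to a relatedness score
    All inputs are illustrative assumptions, not taken from the paper.
    """
    mentions = list(candidates)
    best, best_score = None, float("-inf")
    # Enumerate every joint assignment (exponential; for illustration only).
    for combo in product(*(candidates[m] for m in mentions)):
        # Local term: how well each chosen entity fits its own mention.
        score = w_local * sum(local_score[(m, e)] for m, e in zip(mentions, combo))
        # Global term: pairwise relatedness among all chosen entities.
        score += w_global * sum(
            relatedness.get(frozenset((combo[i], combo[j])), 0.0)
            for i in range(len(combo))
            for j in range(i + 1, len(combo))
        )
        if score > best_score:
            best, best_score = dict(zip(mentions, combo)), score
    return best, best_score


# Hypothetical toy document with two ambiguous mentions.
candidates = {
    "Jordan": ["Michael_Jordan", "Jordan_(country)"],
    "Bulls": ["Chicago_Bulls", "Bull_(animal)"],
}
local_score = {
    ("Jordan", "Michael_Jordan"): 0.4,
    ("Jordan", "Jordan_(country)"): 0.6,
    ("Bulls", "Chicago_Bulls"): 0.5,
    ("Bulls", "Bull_(animal)"): 0.5,
}
relatedness = {frozenset(("Michael_Jordan", "Chicago_Bulls")): 0.9}

best, best_score = map_assignment(candidates, local_score, relatedness)
print(best, best_score)
```

In this toy input the local scores alone favor the country reading of "Jordan", while the relatedness term pulls the joint assignment toward the basketball reading. That interaction is what collective approaches exploit, and it is also why exact joint inference becomes expensive as the number of mentions grows.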
