LINDEN: linking named entities with knowledge base via semantic knowledge

Integrating extracted facts with an existing knowledge base has created an urgent need to address the problem of entity linking. Specifically, entity linking is the task of linking an entity mention in text to the corresponding real-world entity in the existing knowledge base. This task is challenging because of name ambiguity, textual inconsistency, and the lack of world knowledge in the knowledge base. Several methods have been proposed to tackle this problem, but they rely largely on co-occurrence statistics of terms between the text around the entity mention and the document associated with the entity. In this paper, we propose LINDEN, a novel framework that links named entities in text with a knowledge base unifying Wikipedia and WordNet, by leveraging the rich semantic knowledge embedded in Wikipedia and the taxonomy of the knowledge base. We extensively evaluate LINDEN on two public data sets, and the empirical results show that it significantly outperforms state-of-the-art methods in terms of accuracy.
