Linking Entities in Unstructured Texts with RDF Knowledge Bases

Entity linking (also called entity annotation) is the task of linking named entity mentions on Web pages to the corresponding entities in a knowledge base (KB). With continued progress in information extraction and semantic search, entity linking has received much attention from both the research and industrial communities. The main challenge of the task is entity disambiguation. To the best of our knowledge, the large existing RDF KBs have not been fully exploited for entity linking. In this paper, we study the entity linking problem using RDF KBs. Besides the accuracy of entity linking, we also study scalability to huge Web corpora and large RDF KBs. The experimental results show that our solution achieves both very good accuracy and good scalability.
