Collective Named Entity Disambiguation using Graph Ranking and Clique Partitioning Approaches

Disambiguating named entities (NEs) in running text to their correct interpretations in a specific knowledge base (KB) is an important problem in NLP. This paper presents two collective disambiguation approaches that use a graph representation in which the possible KB candidates for NE textual mentions are nodes and the coherence relations between different NE candidates are edges. Each node has a local confidence score and each edge has a weight. The first approach uses PageRank (PR) to rank all nodes and selects a candidate based on its PR score combined with its local confidence score. The second approach uses an adapted clique partitioning technique to find the highest-weighted clique and then expands this clique until all NE textual mentions are disambiguated. Experiments on 27,819 NE textual mentions show the effectiveness of both approaches, which outperform both the baseline and state-of-the-art approaches.
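
The first approach can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how the PR score and the local confidence score are combined, so the product used here is an assumption, as are the function names (`pagerank`, `disambiguate`), the damping factor, the iteration count, and the toy mentions, candidates, and scores. Edge weights are assumed to be strictly positive.

```python
from collections import defaultdict


def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Weighted PageRank on an undirected graph, computed by power iteration.

    nodes: list of candidate identifiers.
    edges: dict mapping (u, v) pairs to a positive coherence weight.
    """
    neighbors = defaultdict(dict)
    for (u, v), weight in edges.items():
        neighbors[u][v] = weight
        neighbors[v][u] = weight

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by isolated nodes is redistributed uniformly.
        dangling = sum(rank[u] for u in nodes if not neighbors[u])
        new_rank = {}
        for v in nodes:
            # Each neighbor u passes on a share of its rank proportional
            # to the weight of the edge (u, v).
            incoming = sum(
                rank[u] * weight / sum(neighbors[u].values())
                for u, weight in neighbors[v].items()
            )
            new_rank[v] = (1.0 - damping) / n + damping * (incoming + dangling / n)
        rank = new_rank
    return rank


def disambiguate(candidates, confidence, edges):
    """Pick one candidate per mention.

    ASSUMPTION: the final score is PR score * local confidence; the
    abstract only says the two scores are "combined".
    """
    nodes = [c for cands in candidates.values() for c in cands]
    rank = pagerank(nodes, edges)
    return {
        mention: max(cands, key=lambda c: rank[c] * confidence[c])
        for mention, cands in candidates.items()
    }


# Hypothetical toy input: two mentions, three KB candidates.
candidates = {
    "Paris": ["Paris_(city)", "Paris_Hilton"],
    "Seine": ["Seine_(river)"],
}
confidence = {"Paris_(city)": 0.5, "Paris_Hilton": 0.5, "Seine_(river)": 1.0}
edges = {("Paris_(city)", "Seine_(river)"): 1.0}

print(disambiguate(candidates, confidence, edges))
```

In this toy graph the two candidates for "Paris" have equal local confidence, so the choice is decided by the coherence edge to the only candidate for "Seine", which is the collective effect the abstract describes.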
