Entity disambiguation to Wikipedia using collective ranking

We propose feedback query expansion and re-ranking methods that model the semantic relatedness of entities within one document. We demonstrate the effectiveness of our methods by comparing them with baseline systems on three data sets. Our team placed in the top three across multiple metrics for the English EDL task in TAC 2014.

Entity disambiguation is a fundamental task of semantic Web annotation. Entity Linking (EL) is an essential procedure in entity disambiguation; it aims to link a mention appearing in plain text to a structured or semi-structured knowledge base, such as Wikipedia. Existing research on EL usually annotates the mentions in a text one by one and treats entities as independent of each other. However, this assumption may not hold in many application scenarios. For example, if two mentions appear in one text, they are likely to have certain intrinsic relationships. In this paper, we first propose a novel query expansion method for candidate generation that uses the co-occurrence information of mentions. We further propose a re-ranking model that can be iteratively adjusted based on the predictions of the previous round. Experiments on real-world data demonstrate the effectiveness of our proposed methods for entity disambiguation.
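The idea of re-ranking that is adjusted from the previous round's predictions can be illustrated with a minimal sketch. This is not the paper's model: the function name `rerank_collectively`, the linear mix controlled by `weight`, the averaged coherence term, and the toy relatedness function are all assumptions made for illustration. Each mention starts with its locally best candidate, and every round re-scores candidates by how related they are to the entities currently predicted for the other mentions in the same document.

```python
from typing import Callable, Dict


def rerank_collectively(
    candidates: Dict[str, Dict[str, float]],
    relatedness: Callable[[str, str], float],
    weight: float = 0.5,      # assumed mixing weight, not from the paper
    max_rounds: int = 10,     # assumed iteration cap
) -> Dict[str, str]:
    """Map each mention to one entity, iterating until predictions stop changing."""
    # Round 0: choose the best candidate per mention from local scores alone.
    prediction = {m: max(sorted(c), key=c.get) for m, c in candidates.items()}

    for _ in range(max_rounds):
        updated = {}
        for mention, scored in candidates.items():
            # Entities predicted for the other mentions in the previous round.
            context = [e for m, e in prediction.items() if m != mention]

            def combined(entity: str) -> float:
                if not context:
                    return scored[entity]
                coherence = sum(relatedness(entity, o) for o in context) / len(context)
                return (1 - weight) * scored[entity] + weight * coherence

            updated[mention] = max(sorted(scored), key=combined)

        if updated == prediction:  # converged: no mention changed its entity
            break
        prediction = updated
    return prediction


# Hypothetical toy data: local scores prefer the country, context prefers the player.
toy_candidates = {
    "Jordan": {"Jordan_(country)": 0.6, "Michael_Jordan": 0.4},
    "Bulls": {"Chicago_Bulls": 0.9},
}
related_pairs = {frozenset({"Michael_Jordan", "Chicago_Bulls"})}


def toy_relatedness(a: str, b: str) -> float:
    return 1.0 if frozenset({a, b}) in related_pairs else 0.0


result = rerank_collectively(toy_candidates, toy_relatedness)
```

In this toy document, the local scores alone would link "Jordan" to the country, while the co-occurring "Bulls" mention shifts the collective decision toward the basketball player. In practice the relatedness function would be derived from the knowledge base, for example from Wikipedia link overlap.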
