Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation

Name ambiguity problem has raised urgent demands for efficient, high-quality named entity disambiguation methods. In recent years, the increasing availability of large-scale, rich semantic knowledge sources (such as Wikipedia and WordNet) creates new opportunities to enhance the named entity disambiguation by developing algorithms which can exploit these knowledge sources at best. The problem is that these knowledge sources are heterogeneous and most of the semantic knowledge within them is embedded in complex structures, such as graphs and networks. This paper proposes a knowledge-based method, called Structural Semantic Relatedness (SSR), which can enhance the named entity disambiguation by capturing and leveraging the structural semantic knowledge in multiple knowledge sources. Empirical results show that, in comparison with the classical BOW based methods and social network based methods, our method can significantly improve the disambiguation performance by respectively 8.7% and 14.7%.

[1]  Breck Baldwin,et al.  Entity-Based Cross-Document Coreferencing Using the Vector Space Model , 1998, COLING.

[2]  Dekang Lin,et al.  An Information-Theoretic Definition of Similarity , 1998, ICML.

[3]  Christiane Fellbaum,et al.  Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.

[4]  Jennifer Widom,et al.  SimRank: a measure of structural-context similarity , 2002, KDD.

[5]  David Yarowsky,et al.  Unsupervised Personal Name Disambiguation , 2003, CoNLL.

[6]  José M. Castaño Computational Linguistics and Intelligent Text Processing , 2003, Lecture Notes in Computer Science.

[7]  Cheng Niu,et al.  Weakly Supervised Learning for Cross-document Person Name Disambiguation Supported by Information Extraction , 2004, ACL.

[8]  Eduard Hovy,et al.  Multi-Document Person Name Resolution , 2004 .

[9]  Edoardo M. Airoldi,et al.  A Network Analysis Model for Disambiguation of Names in Lists , 2005, Comput. Math. Organ. Theory.

[10]  Xiaojun Wan,et al.  Person resolution in person search results: WebHawk , 2005, CIKM '05.

[11]  Ted Pedersen,et al.  Name Discrimination by Clustering Similar Contexts , 2005, CICLing.

[12]  Bradley Malin,et al.  Unsupervised Name Disambiguation via Social Network Similarity , 2005 .

[13]  Andrew McCallum,et al.  Disambiguating Web appearances of people in a social network , 2005, WWW '05.

[14]  William W. Cohen,et al.  Contextual search and name disambiguation in email using graphs , 2006, SIGIR.

[15]  M. Newman,et al.  Vertex similarity in networks. , 2005, Physical review. E, Statistical, nonlinear, and soft matter physics.

[16]  Ian H. Witten,et al.  Mining Domain-Specific Thesauri from Wikipedia: A Case Study , 2006, 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings)(WI'06).

[17]  Simone Paolo Ponzetto,et al.  WikiRelate! Computing Semantic Relatedness Using Wikipedia , 2006, AAAI.

[18]  Jan-Ming Ho,et al.  Web Appearance Disambiguation of Personal Names Based on Network Motif , 2006, 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings)(WI'06).

[19]  Razvan C. Bunescu,et al.  Using Encyclopedic Knowledge for Named entity Disambiguation , 2006, EACL.

[20]  Ismailcem Budak Arpinar,et al.  Ontology-Driven Automatic Entity Disambiguation in Unstructured Text , 2006, SEMWEB.

[21]  Gerhard Weikum,et al.  WWW 2007 / Track: Semantic Web Session: Ontologies ABSTRACT YAGO: A Core of Semantic Knowledge , 2022 .

[22]  Silviu Cucerzan,et al.  Large-Scale Named Entity Disambiguation Based on Wikipedia Data , 2007, EMNLP.

[23]  Paul M. B. Vitányi,et al.  The Google Similarity Distance , 2004, IEEE Transactions on Knowledge and Data Engineering.

[24]  Ying Gao,et al.  Name Disambiguation Using Web Connection , 2007 .

[25]  Ying Chen,et al.  Towards Robust Unsupervised Personal Name Disambiguation , 2007, EMNLP-CoNLL.

[26]  Julio Gonzalo,et al.  The SemEval-2007 WePS Evaluation: Establishing a benchmark for the Web People Search Task , 2007, Fourth International Workshop on Semantic Evaluations (SemEval-2007).

[27]  Ian H. Witten,et al.  Topic indexing with Wikipedia , 2008 .

[28]  Ian H. Witten,et al.  An effective, low-cost measure of semantic relatedness obtained from Wikipedia links , 2008 .

[29]  Dmitri V. Kalashnikov,et al.  Towards breaking the quality curse.: a web-querying approach to web people search. , 2008, SIGIR '08.

[30]  Julio Gonzalo,et al.  A comparison of extrinsic clustering evaluation metrics based on formal constraints , 2008, Information Retrieval.

[31]  Julio Gonzalo,et al.  A comparison of extrinsic clustering evaluation metrics based on formal constraints , 2009, Information Retrieval.

[32]  Julio Gonzalo,et al.  WePS 2 Evaluation Campaign: Overview of the Web People Search Clustering Task , 2009 .

[33]  Xianpei Han,et al.  Named entity disambiguation by leveraging wikipedia semantic knowledge , 2009, CIKM.

[34]  Heng Ji,et al.  Overview of the TAC 2010 Knowledge Base Population Track , 2010 .