Joint embedding of hierarchical structure and context for entity disambiguation

Entity linking is the task of linking mentions in text to their description pages in a knowledge base. Because mentions are often polysemous, the key issue in entity linking is entity disambiguation, which reduces to choosing the correct entity from a set of candidates. In this paper, we propose a novel embedding method designed specifically for entity disambiguation. Existing distributed representations make limited use of the structured knowledge in knowledge bases such as Wikipedia. Our method jointly embeds information from the hierarchical structure of the knowledge base and from context words. We extend the continuous bag-of-words model by adding hierarchical categories and hyperlink structure. So far, we have trained a joint model that adds category information. We demonstrate the utility of the proposed approach on an entity relatedness dataset, where the joint embedding model outperforms a model that uses context words alone. In addition, we run a disambiguation experiment on a dataset, where the new embedding model shows a slight improvement.
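As a rough illustration of the idea, the sketch below shows how a continuous bag-of-words style input could average category vectors together with context-word vectors, and how candidates for a mention could then be ranked by cosine similarity. The abstract does not specify the training objective or the scoring function, so everything here is an assumption: the function names (`joint_input`, `disambiguate`), the use of cosine similarity, and the hand-set two-dimensional toy vectors are all hypothetical and are not the authors' implementation or data.

```python
import math

# Hypothetical toy embeddings (hand-set for illustration, not learned values).
WORD_VECS = {"engine": [1.0, 0.1], "speed": [0.8, 0.3], "jungle": [0.1, 1.0]}
CATEGORY_VECS = {"Car manufacturers": [0.9, 0.0], "Felines": [0.0, 0.9]}
ENTITY_VECS = {"Jaguar Cars": [0.95, 0.05], "Jaguar (animal)": [0.05, 0.95]}


def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def joint_input(context_vecs, category_vecs):
    """Assumed CBOW-style hidden layer: the mean over context-word vectors
    and the category vectors attached to the target entity."""
    return mean_vector(context_vecs + category_vecs)


def disambiguate(context_vecs, candidates):
    """Pick the candidate entity whose vector is closest (by cosine) to the
    averaged context. The scoring rule is an assumption of this sketch."""
    context = mean_vector(context_vecs)
    return max(candidates, key=lambda name: cosine(context, candidates[name]))


if __name__ == "__main__":
    hidden = joint_input(
        [WORD_VECS["engine"], WORD_VECS["speed"]],
        [CATEGORY_VECS["Car manufacturers"]],
    )
    print("joint hidden vector:", hidden)
    print(disambiguate([WORD_VECS["engine"], WORD_VECS["speed"]], ENTITY_VECS))
    print(disambiguate([WORD_VECS["jungle"]], ENTITY_VECS))
```

In a trained model the vectors would be learned jointly rather than set by hand; the point of the sketch is only that category information enters the same averaged input as the context words, so entities sharing categories are pulled toward similar regions of the embedding space.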
