Aligning Cross-Lingual Entities with Multi-Aspect Information

Multilingual knowledge graphs (KGs), such as YAGO and DBpedia, represent entities in different languages. The task of cross-lingual entity alignment is to match entities in a source language with their counterparts in target languages. In this work, we investigate embedding-based approaches that encode entities from multilingual KGs into the same vector space, where equivalent entities are close to each other. Specifically, we apply graph convolutional networks (GCNs) to learn entity embeddings that combine multi-aspect information: topological connections, relations, and attributes. To exploit the literal descriptions of entities expressed in different languages, we propose two uses of a pretrained multilingual BERT model to bridge cross-lingual gaps. We further propose two strategies to integrate the GCN-based and BERT-based modules to boost performance. Extensive experiments on two benchmark datasets demonstrate that our method significantly outperforms existing systems.
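
To make the embedding-based idea concrete, the following is a minimal sketch, not the model described above. It assumes a single graph convolution in the commonly used form ReLU(D^{-1/2}(A + I)D^{-1/2} X W), a weight matrix shared by both KGs, hand-picked toy features, and nearest-neighbor matching by Euclidean distance. The function names (`normalize_adjacency`, `gcn_layer`, `align`) and all values are hypothetical and chosen only for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Add self-loops and apply symmetric normalization: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm, features, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ features @ weight, 0.0)


def align(source_emb, target_emb):
    """For each source entity, return the index of the closest target entity."""
    diff = source_emb[:, None, :] - target_emb[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return dist.argmin(axis=1)


# Toy source KG: a 3-entity path graph (0 - 1 - 2) with 2-dimensional attribute features.
source_adj = np.array([[0.0, 1.0, 0.0],
                       [1.0, 0.0, 1.0],
                       [0.0, 1.0, 0.0]])
source_feat = np.array([[1.0, 0.0],
                        [0.0, 1.0],
                        [1.0, 1.0]])

# Toy target KG: the same graph with its entities listed in a different order
# (an assumption standing in for the "other language" KG).
perm = np.array([2, 0, 1])
target_adj = source_adj[perm][:, perm]
target_feat = source_feat[perm]

# Shared, hand-picked weights (a trained model would learn these).
weight = np.array([[1.0, 0.5],
                   [0.2, 1.0]])

source_emb = gcn_layer(normalize_adjacency(source_adj), source_feat, weight)
target_emb = gcn_layer(normalize_adjacency(target_adj), target_feat, weight)

predicted = align(source_emb, target_emb)
print(predicted)
```

Because a graph convolution with shared weights is equivariant to node reordering, the two embedding sets here are permutations of each other, so nearest-neighbor search recovers the inverse of `perm`. Real multilingual KGs differ in structure and attributes, which is why training on seed alignments and additional signals such as entity descriptions matter in practice.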
