Cross-Lingual Entity Alignment via Joint Attribute-Preserving Embedding

Entity alignment is the task of finding entities in two knowledge bases (KBs) that represent the same real-world object. When facing KBs in different natural languages, conventional cross-lingual entity alignment methods rely on machine translation to eliminate the language barriers. These approaches often suffer from the uneven quality of translations between languages. While recent embedding-based techniques encode entities and relationships in KBs and do not need machine translation for cross-lingual entity alignment, a significant number of attributes remain largely unexplored. In this paper, we propose a joint attribute-preserving embedding model for cross-lingual entity alignment. It jointly embeds the structures of two KBs into a unified vector space and further refines it by leveraging attribute correlations in the KBs. Our experimental results on real-world datasets show that this approach significantly outperforms the state-of-the-art embedding approaches for cross-lingual entity alignment and could be complemented with methods based on machine translation.

[1]  Zhen Wang,et al.  Knowledge Graph Embedding by Translating on Hyperplanes , 2014, AAAI.

[2]  Danqi Chen,et al.  Reasoning With Neural Tensor Networks for Knowledge Base Completion , 2013, NIPS.

[3]  Volker Tresp,et al.  Type-Constrained Representation Learning in Knowledge Graphs , 2015, SEMWEB.

[4]  Heiko Paulheim,et al.  Entity Matching on Web Tables: a Table Embeddings approach for Blocking , 2017, EDBT.

[5]  Danqi Chen,et al.  Learning New Facts From Knowledge Bases With Neural Tensor Networks and Semantic Word Vectors , 2013, ICLR.

[6]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[7]  Declan O'Sullivan,et al.  Cross-Lingual Ontology Mapping and Its Use on the Multilingual Semantic Web , 2010, MSW.

[8]  Heiko Paulheim,et al.  RDF2Vec: RDF Graph Embeddings for Data Mining , 2016, SEMWEB.

[9]  Carlo Zaniolo,et al.  Multilingual Knowledge Graph Embeddings for Cross-lingual Knowledge Alignment , 2016, IJCAI.

[10]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[11]  Declan O'Sullivan,et al.  Cross-Lingual Ontology Mapping - An Investigation of the Impact of Machine Translation , 2009, ASWC.

[12]  Hans-Peter Kriegel,et al.  A Three-Way Model for Collective Learning on Multi-Relational Data , 2011, ICML.

[13]  Dong Wang,et al.  Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation , 2015, NAACL.

[14]  Christopher D. Manning,et al.  Bilingual Word Embeddings for Phrase-Based Machine Translation , 2013, EMNLP.

[15]  Pascal Hitzler,et al.  String Similarity Metrics for Ontology Alignment , 2013, SEMWEB.

[16]  Goran Glavas,et al.  Improving Neural Knowledge Base Completion with Cross-Lingual Projections , 2017, EACL.

[17]  Jason Weston,et al.  Translating Embeddings for Modeling Multi-relational Data , 2013, NIPS.

[18]  Juan-Zi Li,et al.  Cross-lingual knowledge linking across wiki knowledge bases , 2012, WWW.

[19]  Zhiyuan Liu,et al.  Learning Entity and Relation Embeddings for Knowledge Graph Completion , 2015, AAAI.

[20]  Jun Zhao,et al.  A Joint Embedding Method for Entity Alignment of Knowledge Bases , 2016, CCKS.

[21]  Xiaocheng Feng,et al.  English-Chinese Knowledge Base Translation with Neural Network , 2016, COLING.

[22]  Jason Weston,et al.  Learning Structured Embeddings of Knowledge Bases , 2011, AAAI.

[23]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[24]  Zhiyuan Liu,et al.  Knowledge Representation Learning with Entities, Attributes and Relations , 2016, IJCAI.

[25]  Philipp Cimiano,et al.  A Machine Learning Approach to Multilingual and Cross-Lingual Ontology Matching , 2011, SEMWEB.