Cross-lingual Entity Alignment for Knowledge Graphs with Incidental Supervision from Free Text

Much research effort has been devoted to multilingual knowledge graph (KG) embedding methods for the entity alignment task, which seeks to match entities in different language-specific KGs that refer to the same real-world object. Such methods are often hindered by the scarcity of seed alignment provided between KGs. We therefore propose a new model, JEANS, which jointly represents multilingual KGs and text corpora in a shared embedding scheme and seeks to improve entity alignment with incidental supervision signals from text. JEANS first deploys an entity grounding process to combine each KG with the monolingual text corpus of the same language. Two learning processes are then conducted: (i) an embedding learning process that encodes the KG and text of each language in one embedding space, and (ii) a self-learning based alignment learning process that iteratively induces the correspondence of entities and that of lexemes between the embedding spaces. Experiments on benchmark datasets show that JEANS achieves promising improvement on entity alignment with incidental supervision, and significantly outperforms state-of-the-art methods that rely solely on internal information of KGs.
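To make the self-learning idea in step (ii) concrete, the following is a minimal sketch, not the JEANS implementation. It assumes that the alignment between two embedding spaces can be approximated by a single translation offset (a deliberately simple stand-in for whatever alignment model is actually used), and that new correspondences are induced as mutual nearest neighbors. The function names (`fit_offset`, `nearest`, `self_learn`) and the toy two-dimensional embeddings are hypothetical and chosen for illustration only.

```python
def fit_offset(src, tgt, pairs):
    """Average displacement from source to target vectors over the given pairs."""
    dim = len(next(iter(src.values())))
    offset = [0.0] * dim
    for s, t in pairs:
        for k in range(dim):
            offset[k] += (tgt[t][k] - src[s][k]) / len(pairs)
    return offset


def nearest(query, candidates):
    """Name of the candidate vector closest to `query` (squared Euclidean distance)."""
    return min(
        candidates,
        key=lambda name: sum((q - c) ** 2 for q, c in zip(query, candidates[name])),
    )


def self_learn(src, tgt, seeds, rounds=5):
    """Alternate between fitting the mapping and inducing new aligned pairs."""
    seeds = set(seeds)
    seed_src = {s for s, _ in seeds}
    seed_tgt = {t for _, t in seeds}
    pairs = set(seeds)
    for _ in range(rounds):
        offset = fit_offset(src, tgt, sorted(pairs))
        # Map every source vector into the target space.
        mapped = {s: [x + o for x, o in zip(v, offset)] for s, v in src.items()}
        forward = {s: nearest(mapped[s], tgt) for s in src}
        backward = {t: nearest(tgt[t], mapped) for t in tgt}
        # Keep only mutual nearest neighbors; seed pairs are trusted as given.
        induced = {
            (s, t)
            for s, t in forward.items()
            if backward[t] == s and s not in seed_src and t not in seed_tgt
        }
        updated = seeds | induced
        if updated == pairs:  # no change, so the loop has converged
            break
        pairs = updated
    return pairs


# Toy embeddings (assumed data): the target space is roughly the source
# space shifted by (2, -1), with a little noise.
src = {"en:Paris": [0.0, 0.0], "en:Berlin": [1.0, 0.0],
       "en:Tokyo": [0.0, 1.0], "en:Lima": [1.0, 1.0]}
tgt = {"fr:Paris": [2.0, -1.0], "fr:Berlin": [3.1, -1.0],
       "fr:Tokyo": [2.0, 0.1], "fr:Lima": [2.9, -0.1]}
seeds = [("en:Paris", "fr:Paris")]

aligned = self_learn(src, tgt, seeds)
print(sorted(aligned))
```

Starting from a single seed pair, the loop recovers the remaining correspondences on this toy data, which illustrates why self-learning can help when seed alignment is scarce. Requiring mutual nearest neighbors is one common way to limit the error propagation that such iterative induction can otherwise suffer from.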
