CoDEx: A Comprehensive Knowledge Graph Completion Benchmark

We present CoDEx, a set of knowledge graph completion datasets extracted from Wikidata and Wikipedia that improve upon existing knowledge graph completion benchmarks in scope and level of difficulty. In terms of scope, CoDEx comprises three knowledge graphs varying in size and structure, multilingual descriptions of entities and relations, and tens of thousands of hard negative triples that are plausible but verified to be false. To characterize CoDEx, we contribute thorough empirical analyses and benchmarking experiments. First, we analyze each CoDEx dataset in terms of logical relation patterns. Next, we report baseline link prediction and triple classification results on CoDEx for five extensively tuned embedding models. Finally, we differentiate CoDEx from the popular FB15K-237 knowledge graph completion dataset by showing that CoDEx covers more diverse and interpretable content and is a more difficult link prediction benchmark. Data, code, and pretrained models are publicly available.
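For readers unfamiliar with how link prediction is typically scored, the following is a minimal sketch of filtered ranking with mean reciprocal rank (MRR) and Hits@k. These metrics are a common convention in the knowledge graph completion literature and are an assumption here, since the abstract does not name the metrics used. The function names and toy scores are hypothetical and are not taken from the CoDEx code.

```python
from typing import Dict, List, Set, Tuple


def filtered_tail_rank(scores: Dict[str, float],
                       true_tail: str,
                       known_tails: Set[str]) -> int:
    """Rank the true tail among all candidate entities (1 = best).

    Other tails already known to be correct for the same (head, relation)
    query are skipped, so they cannot push the true tail down the ranking.
    """
    true_score = scores[true_tail]
    rank = 1
    for entity, score in scores.items():
        if entity == true_tail or entity in known_tails:
            continue  # filtered setting: ignore other known-true answers
        if score > true_score:
            rank += 1
    return rank


def mrr_and_hits(ranks: List[int], k: int = 10) -> Tuple[float, float]:
    """Return (mean reciprocal rank, Hits@k) over a list of ranks."""
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    hits = sum(1 for r in ranks if r <= k) / len(ranks)
    return mrr, hits


# Toy query with hypothetical model scores for three candidate tails.
toy_scores = {"a": 0.9, "b": 0.8, "c": 0.5}
toy_rank = filtered_tail_rank(toy_scores, true_tail="c", known_tails={"a"})
toy_mrr, toy_hits = mrr_and_hits([1, 2, 4], k=3)
```

In this sketch a higher score means a more plausible triple; a model that outputs distances would need the comparison reversed.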
