You CAN Teach an Old Dog New Tricks! On Training Knowledge Graph Embeddings

Knowledge graph embedding (KGE) models learn algebraic representations of the entities and relations in a knowledge graph. A vast number of KGE techniques for multi-relational link prediction have been proposed in the recent literature, often with state-of-the-art performance. These approaches differ along a number of dimensions, including model architecture, training strategy, and approach to hyperparameter optimization. In this paper, we take a step back and aim to summarize and quantify empirically the impact of each of these dimensions on model performance. We report on the results of an extensive experimental study with popular model architectures and training strategies across a wide range of hyperparameter settings. We found that, when trained appropriately, the relative performance differences between various model architectures often shrink and sometimes even reverse when compared to prior results. For example, RESCAL, one of the first KGE models, showed strong performance when trained with state-of-the-art techniques; it was competitive with, or even outperformed, more recent architectures. We also found that good model configurations (often superior to those reported in prior studies) can be found by exploring relatively few random samples from a large hyperparameter space. Our results suggest that many of the more advanced architectures and techniques proposed in the literature should be revisited to reassess their individual benefits.
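
For readers unfamiliar with RESCAL: it scores a triple (s, p, o) bilinearly as e_s^T R_p e_o, where e_s and e_o are d-dimensional entity embeddings and R_p is a d x d relation-specific matrix. A minimal NumPy sketch of this score function (variable names and the toy dimension are illustrative, not the study's code):

```python
import numpy as np

def rescal_score(e_s, R_p, e_o):
    # Bilinear RESCAL score e_s^T R_p e_o (Nickel et al., 2011).
    return e_s @ R_p @ e_o

# Toy example with embedding dimension d = 4.
rng = np.random.default_rng(0)
d = 4
e_s = rng.normal(size=d)          # subject entity embedding
R_p = rng.normal(size=(d, d))     # relation-specific interaction matrix
e_o = rng.normal(size=d)          # object entity embedding
print(rescal_score(e_s, R_p, e_o))
```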
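
The random-sampling strategy referred to above follows random hyperparameter search in the spirit of Bergstra and Bengio (2012): sample configurations independently from the search space and keep the one that performs best on validation data. A hedged sketch of such a loop; the search space and the evaluate callback below are illustrative placeholders, not the study's actual configuration, which also varies training type, loss, regularization, and optimizer settings:

```python
import math
import random

# Illustrative search space: lists are sampled uniformly at random;
# tuples denote continuous ranges ("log" = log-uniform).
SEARCH_SPACE = {
    "embedding_dim": [128, 256, 512],
    "batch_size": [128, 256, 512, 1024],
    "learning_rate": ("log", 1e-4, 1e-1),
    "dropout": ("uniform", 0.0, 0.5),
}

def sample_config(space, rng):
    """Draw one random configuration from the search space."""
    cfg = {}
    for name, spec in space.items():
        if isinstance(spec, list):
            cfg[name] = rng.choice(spec)
        elif spec[0] == "log":
            cfg[name] = 10 ** rng.uniform(math.log10(spec[1]), math.log10(spec[2]))
        else:
            cfg[name] = rng.uniform(spec[1], spec[2])
    return cfg

def random_search(evaluate, space, n_trials=30, seed=0):
    """Keep the best of n_trials randomly sampled configurations.

    `evaluate` is a placeholder: in practice it trains a KGE model with
    the given config and returns a validation metric such as MRR.
    """
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = sample_config(space, rng)
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

The paper's observation is that relatively few such samples from a large space already recover strong, and often previously unreported, configurations.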
