MulDE: Multi-teacher Knowledge Distillation for Low-dimensional Knowledge Graph Embeddings

Link prediction based on knowledge graph embedding (KGE) aims to predict new triples to complete knowledge graphs (KGs) automatically. However, recent KGE models tend to improve performance by excessively increasing vector dimensions, which incurs enormous training and storage costs in practical applications. To address this problem, we first theoretically analyze the capacity of low-dimensional space for KG embeddings based on the principle of minimum entropy. We then propose MulDE, a novel knowledge distillation framework for knowledge graph embedding that employs multiple low-dimensional KGE models as teachers. Under a novel iterative distillation strategy, MulDE adaptively produces soft labels according to the training epoch and the student's performance. Experimental results show that MulDE effectively improves both the performance and the training speed of low-dimensional KGE models. The distilled 32-dimensional models are competitive with state-of-the-art (SotA) high-dimensional methods on several commonly used datasets.
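To make the multi-teacher setup concrete, below is a minimal sketch (not the authors' code) of the standard ingredient such a framework builds on: softening the link-prediction scores of several teachers into averaged soft labels and training the student with a Hinton-style KL distillation loss. All names here (`multi_teacher_soft_labels`, `distillation_loss`, the temperature values, the uniform averaging of teachers) are illustrative assumptions; MulDE's actual iterative strategy additionally adapts the soft labels to the training epoch and the student's performance, which this sketch omits.

```python
# Minimal multi-teacher knowledge distillation sketch for link-prediction
# scores, under the assumptions stated above. Requires PyTorch.
import torch
import torch.nn.functional as F

def multi_teacher_soft_labels(teacher_scores, temperature=1.0):
    """Average the temperature-softened score distributions of several teachers.

    teacher_scores: list of tensors, each of shape (batch, num_candidates),
                    holding one teacher's plausibility scores over candidate
                    entities for each query triple.
    """
    soft = [F.softmax(s / temperature, dim=-1) for s in teacher_scores]
    # Uniform average across teachers (an assumption; MulDE weights adaptively).
    return torch.stack(soft, dim=0).mean(dim=0)

def distillation_loss(student_scores, teacher_scores, temperature=2.0):
    """KL divergence between the student's softened distribution and the
    averaged teacher soft labels (the classic Hinton et al. KD objective)."""
    soft_labels = multi_teacher_soft_labels(teacher_scores, temperature)
    log_student = F.log_softmax(student_scores / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_student, soft_labels, reduction="batchmean") * temperature ** 2

# Usage: random scores standing in for three hypothetical low-dimensional
# teachers and one 32-dimensional student over 100 candidate entities.
if __name__ == "__main__":
    batch, candidates = 8, 100
    teachers = [torch.randn(batch, candidates) for _ in range(3)]
    student = torch.randn(batch, candidates, requires_grad=True)
    loss = distillation_loss(student, teachers)
    loss.backward()
    print(f"distillation loss: {loss.item():.4f}")
```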
