Adaptive knowledge subgraph ensemble for robust and trustworthy knowledge graph completion

Knowledge graph (KG) embedding approaches are widely used to infer missing facts from the intrinsic structure of a graph. However, noisy facts in automatically extracted or crowdsourced KGs significantly reduce the reliability of embedding learners. In this paper, we study the underlying reasons for the performance drop on noisy knowledge graphs, and we propose an ensemble framework, Adaptive Knowledge Subgraph Ensemble (AKSE), to enhance the robustness and trustworthiness of knowledge graph completion. By employing an effective knowledge subgraph extraction approach to re-sample sub-components from the original knowledge graph, AKSE generates different representations for training diverse base learners (e.g., TransE and DistMult), which substantially alleviates the effect of noise on KG embedding. All embedding learners are integrated into a unified framework that reduces generalization error via either a simple or an adaptive weighting scheme, in which each learner's weight is allocated according to its individual prediction capacity. Experimental results show that our ensemble framework is more robust than existing knowledge graph embedding approaches, both on KGs with manually injected noise and on inherently noisy extracted KGs.
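
The pipeline described in the abstract (re-sample subgraphs, train one base learner per subgraph, combine the learners with simple or adaptive weights) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the extraction procedure, the capacity measure, or the weight formula. The sketch assumes uniform random sampling of triples for subgraph extraction, a co-occurrence counter (`CountScorer`) as a stand-in for an embedding model such as TransE or DistMult, and weights proportional to a pairwise ranking accuracy on validation data. All function and class names are hypothetical.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def sample_subgraph(triples: Sequence[Triple], ratio: float,
                    rng: random.Random) -> List[Triple]:
    """Draw a random subset of triples (assumed stand-in for subgraph extraction)."""
    k = max(1, int(len(triples) * ratio))
    return rng.sample(list(triples), k)


class CountScorer:
    """Placeholder base learner, NOT an embedding model.

    Scores a triple by how often its (relation, tail) pair occurs in the
    subgraph it was fitted on. A real system would train TransE, DistMult,
    or a similar model here.
    """

    def fit(self, triples: Sequence[Triple]) -> "CountScorer":
        self.counts = Counter((r, t) for _, r, t in triples)
        return self

    def score(self, triple: Triple) -> float:
        _, r, t = triple
        return float(self.counts[(r, t)])


def capacity(model: CountScorer, positives: Sequence[Triple],
             negatives: Sequence[Triple]) -> float:
    """Fraction of (positive, negative) pairs the model ranks correctly (assumed measure)."""
    wins = sum(model.score(p) > model.score(n)
               for p, n in zip(positives, negatives))
    return wins / len(positives)


def simple_weights(n_models: int) -> List[float]:
    """Simple scheme: every learner gets the same weight."""
    return [1.0 / n_models] * n_models


def adaptive_weights(capacities: Sequence[float]) -> List[float]:
    """Adaptive scheme (assumed form): weights proportional to capacity."""
    total = sum(capacities)
    if total == 0:
        # No learner is better than another; fall back to uniform weights.
        return simple_weights(len(capacities))
    return [c / total for c in capacities]


def ensemble_score(models: Sequence[CountScorer], weights: Sequence[float],
                   triple: Triple) -> float:
    """Weighted sum of the base learners' scores for one candidate triple."""
    return sum(w * m.score(triple) for m, w in zip(models, weights))
```

In use, one would call `sample_subgraph` several times with a seeded `random.Random`, fit one learner per sample, compute `capacity` for each on held-out positive and corrupted triples, and pass the resulting weights to `ensemble_score`. The intuition is that a noisy triple appears in only some subgraphs, so learners misled by it can be outvoted or down-weighted.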
