Random Semantic Tensor Ensemble for Scalable Knowledge Graph Link Prediction

Link prediction on knowledge graphs is useful in numerous applications such as semantic search, question answering, entity disambiguation, enterprise decision support, and recommender systems. Many of these applications require fast responses and operate on constantly changing data, yet existing methods often lack the speed and adaptability to meet these requirements. The problem is aggravated by the fact that knowledge graphs are often extremely large, easily containing millions of entities, which renders many of these methods impractical. In this paper, we address the weaknesses of current methods by proposing Random Semantic Tensor Ensemble (RSTE), a scalable, ensemble-enabled framework based on tensor factorization. Our approach samples the knowledge graph tensor in its graph representation and performs link prediction via ensembles of tensor factorizations. Experiments on both publicly available datasets and real-world enterprise/sales knowledge bases show that our approach is not only highly scalable, parallelizable, and memory efficient, but also significantly increases prediction accuracy across all datasets.
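The core idea stated in the abstract (factorize randomly sampled sub-tensors of the knowledge-graph tensor and combine their link-prediction scores) can be made concrete with a small sketch. The code below is only an illustrative sketch, not the authors' implementation: the abstract does not specify the base factorization model, the sampling scheme, or the aggregation rule, so the entity sampling, the plain gradient-descent CP/PARAFAC-style factorization, and the names cp_factorize and rste_predict are all assumptions introduced here for illustration.

import numpy as np


def cp_factorize(X, rank=8, steps=300, lr=0.02, seed=0):
    # CP/PARAFAC-style factorization of a small dense binary tensor
    # X with shape (entities, entities, relations), fit by plain
    # gradient descent on the squared reconstruction error.
    rng = np.random.default_rng(seed)
    n, _, m = X.shape
    A = rng.normal(scale=0.1, size=(n, rank))  # subject factors
    B = rng.normal(scale=0.1, size=(n, rank))  # object factors
    C = rng.normal(scale=0.1, size=(m, rank))  # relation factors
    for _ in range(steps):
        Xhat = np.einsum('ir,jr,kr->ijk', A, B, C)
        E = Xhat - X                           # reconstruction error
        gA = np.einsum('ijk,jr,kr->ir', E, B, C)
        gB = np.einsum('ijk,ir,kr->jr', E, A, C)
        gC = np.einsum('ijk,ir,jr->kr', E, A, B)
        A -= lr * gA
        B -= lr * gB
        C -= lr * gC
    return A, B, C


def rste_predict(X, n_models=5, sample_frac=0.6, rank=8, seed=0):
    # Ensemble link prediction in the spirit of the abstract: factorize
    # several randomly sampled entity-induced sub-tensors independently
    # and average the reconstructed scores wherever they overlap.
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    scores = np.zeros_like(X, dtype=float)
    counts = np.zeros_like(X, dtype=float)
    for t in range(n_models):
        idx = rng.choice(n, size=max(2, int(sample_frac * n)), replace=False)
        sub = X[np.ix_(idx, idx)]              # sampled sub-tensor, all relations kept
        A, B, C = cp_factorize(sub, rank=rank, seed=seed + t)
        scores[np.ix_(idx, idx)] += np.einsum('ir,jr,kr->ijk', A, B, C)
        counts[np.ix_(idx, idx)] += 1.0
    # Average over the models that actually saw each (subject, object) pair.
    return np.divide(scores, counts, out=np.zeros_like(scores), where=counts > 0)


# Toy usage: 6 entities, 2 relations, a handful of observed triples.
X = np.zeros((6, 6, 2))
X[0, 1, 0] = X[1, 2, 0] = X[3, 4, 1] = 1.0
scores = rste_predict(X, n_models=4, sample_frac=0.7, rank=4)
print(scores[0, 2, 0])  # plausibility score for an unobserved triple

Because each sub-tensor is factorized independently, the ensemble is trivially parallelizable and each model only needs memory for its sample; a production-scale version would presumably operate on a sparse tensor representation rather than the dense toy tensor used above.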
