Efficient and Expressive Knowledge Base Completion Using Subgraph Feature Extraction

We explore some of the practicalities of using random walk inference methods, such as the Path Ranking Algorithm (PRA), for the task of knowledge base completion. We show that the random walk probabilities computed (at great expense) by PRA provide no discernible benefit to performance on this task, so they can safely be dropped. This allows us to define a simpler algorithm for generating feature matrices from graphs, which we call subgraph feature extraction (SFE). In addition to being conceptually simpler than PRA, SFE is much more efficient, reducing computation by an order of magnitude, and more expressive, allowing for much richer features than paths between two nodes in a graph. We show experimentally that this technique gives substantially better performance than PRA and its variants, improving mean average precision from 0.432 to 0.528 on a knowledge base completion task using the NELL KB.
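
To make the idea of binary path features concrete, the following is a minimal sketch and not the authors' implementation. It assumes the knowledge base is given as (subject, relation, object) triples, runs a bounded breadth-first search from each node of a pair, and joins the two searches at shared intermediate nodes to obtain path types. No random walk probabilities are computed; a path type is either present or absent. The function names (`build_adjacency`, `subgraph`, `path_features`), the `_` prefix for inverse edges, and the toy triples are all hypothetical choices made for illustration.

```python
from collections import defaultdict

def build_adjacency(triples):
    """Index edges in both directions; inverse edges get a leading '_' (an assumed convention)."""
    adj = defaultdict(list)
    for subj, rel, obj in triples:
        adj[subj].append((rel, obj))
        adj[obj].append(("_" + rel, subj))
    return adj

def subgraph(adj, start, max_depth):
    """Bounded breadth-first search from `start`.

    Returns a dict mapping each reached node to the set of edge-label
    sequences (path types) that lead to it without revisiting a node.
    """
    reached = defaultdict(set)
    reached[start].add(())
    frontier = [(start, (), frozenset([start]))]
    for _ in range(max_depth):
        next_frontier = []
        for node, path, visited in frontier:
            for label, neighbor in adj[node]:
                if neighbor in visited:
                    continue
                new_path = path + (label,)
                reached[neighbor].add(new_path)
                next_frontier.append((neighbor, new_path, visited | {neighbor}))
        frontier = next_frontier
    return reached

def invert(path):
    """Reverse a path type so it reads in the opposite direction."""
    return tuple(l[1:] if l.startswith("_") else "_" + l for l in reversed(path))

def path_features(adj, source, target, max_depth=2):
    """Binary features: the set of path types connecting source to target."""
    src = subgraph(adj, source, max_depth)
    tgt = subgraph(adj, target, max_depth)
    features = set()
    # Join the two searches at every node that both of them reached.
    for node in src.keys() & tgt.keys():
        for p in src[node]:
            for q in tgt[node]:
                features.add("-".join(p + invert(q)))
    return features

# Toy graph, invented for this example.
triples = [
    ("Seattle", "cityInState", "Washington"),
    ("Washington", "stateInCountry", "USA"),
    ("Portland", "cityInState", "Oregon"),
    ("Oregon", "stateInCountry", "USA"),
]
adj = build_adjacency(triples)
print(path_features(adj, "Seattle", "USA"))
print(path_features(adj, "Seattle", "Portland"))
```

Each node pair yields one row of a feature matrix, with one binary column per distinct path type, which can then be passed to any standard classifier. A real system would also need to hide the edge being predicted during training. Richer feature types beyond paths between the two nodes are outside what this sketch shows.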
