Knowledge Base Completion Using Matrix Factorization

With the development of the Semantic Web, the automatic construction of large-scale knowledge bases (KBs) has received increasing attention in recent years. Although these KBs are very large, they are often still incomplete. Many existing approaches to KB completion perform inference over a single KB and suffer from feature sparsity. Moreover, traditional KB completion methods ignore the complementarity that implicitly exists among different KBs. In this paper, we treat KB completion as a large matrix completion task and integrate different KBs to infer new facts simultaneously. We present two improvements to the quality of inference over KBs. First, to reduce data sparsity, we use the type consistency constraints between relations and entities to initialize negative data in the matrix. Second, we incorporate the similarity of relations between different KBs into the matrix factorization model to take full advantage of the complementarity of the various KBs. Experimental results show that our approach performs better than methods that consider only existing facts or only a single knowledge base, achieving significant accuracy improvements in binary relation prediction.
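
The two ideas in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the abstract gives no loss function, so the squared loss, the graph-Laplacian similarity penalty, the gradient-descent solver, and all names, hyperparameters, and toy data below are assumptions made for illustration. Rows are entity pairs, columns are relations drawn from two hypothetical KBs; cells that violate a relation's type signature are initialized as negatives, and similar relations are pulled toward similar latent vectors.

```python
import numpy as np

def build_matrix(pairs, relations, facts, entity_type, signature):
    """1.0 = known fact, 0.0 = type-inconsistent negative, NaN = unknown."""
    M = np.full((len(pairs), len(relations)), np.nan)
    for i, (s, o) in enumerate(pairs):
        for j, r in enumerate(relations):
            if (s, r, o) in facts:
                M[i, j] = 1.0
            elif (entity_type[s], entity_type[o]) != signature[r]:
                M[i, j] = 0.0  # types violate the relation's signature
    return M

def factorize(M, sim, rank=4, lam=0.01, beta=0.1, lr=0.05, epochs=500, seed=0):
    """Fit M ~ P @ R.T on the non-NaN cells by full-batch gradient descent.

    sim is a symmetric relation-similarity matrix (assumed to be given);
    the beta term penalizes sim[j, k] * ||R[j] - R[k]||^2.
    """
    rng = np.random.default_rng(seed)
    n, m = M.shape
    P = 0.1 * rng.standard_normal((n, rank))
    R = 0.1 * rng.standard_normal((m, rank))
    mask = ~np.isnan(M)
    target = np.where(mask, M, 0.0)
    laplacian = np.diag(sim.sum(axis=1)) - sim
    for _ in range(epochs):
        err = mask * (P @ R.T - target)  # error on observed cells only
        grad_P = err @ R + lam * P
        grad_R = err.T @ P + lam * R + beta * (laplacian @ R)
        P -= lr * grad_P
        R -= lr * grad_R
    return P, R

# Toy data (hypothetical): bornIn comes from KB A, birthPlace from KB B.
entity_type = {"alice": "person", "bob": "person", "carol": "person",
               "paris": "city", "rome": "city",
               "france": "country", "italy": "country"}
relations = ["bornIn", "birthPlace", "capitalOf"]
signature = {"bornIn": ("person", "city"),
             "birthPlace": ("person", "city"),
             "capitalOf": ("city", "country")}
facts = {("alice", "bornIn", "paris"), ("bob", "bornIn", "rome"),
         ("alice", "birthPlace", "paris"), ("carol", "birthPlace", "paris"),
         ("paris", "capitalOf", "france"), ("rome", "capitalOf", "italy")}
pairs = [("alice", "paris"), ("bob", "rome"), ("carol", "paris"),
         ("paris", "france"), ("rome", "italy")]

# Cross-KB similarity: bornIn and birthPlace are assumed to be similar.
sim = np.zeros((3, 3))
sim[0, 1] = sim[1, 0] = 1.0

M = build_matrix(pairs, relations, facts, entity_type, signature)
P, R = factorize(M, sim)
scores = P @ R.T
# scores[2, 0] is the predicted score for the unseen fact (carol, bornIn, paris).
```

In this toy setup the cell for (carol, bornIn, paris) is unknown in KB A, but the matching birthPlace fact from KB B and the similarity term give the factorization evidence to score it above the type-inconsistent capitalOf cell for the same pair.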
