Large-scale factorization of type-constrained multi-relational data

The statistical modeling of large multi-relational datasets has gained increasing attention in recent years. Typical applications involve large knowledge bases such as DBpedia, Freebase, YAGO and the recently introduced Google Knowledge Graph, which contain millions of entities, hundreds to thousands of relations, and billions of relational tuples. Collective factorization methods have been shown to scale to these large multi-relational datasets, in particular in the form of tensor approaches that can exploit highly scalable alternating least squares (ALS) algorithms for calculating the factors. In this paper we extend the recently proposed state-of-the-art RESCAL tensor factorization to take relational type-constraints into account. Relational type-constraints explicitly define the logic of relations by excluding entities from the subject or object role. In addition, we show that in the absence of prior knowledge about type-constraints, local closed-world assumptions can be approximated for each relation by ignoring entities that are never observed as subject or object in that relation. In our experiments on representative large datasets (Cora, DBpedia) that contain up to millions of entities and hundreds of type-constrained relations, we show that the proposed approach is scalable. It further significantly outperforms RESCAL without type-constraints in both runtime and prediction quality.
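The approximation of local closed-world assumptions described above can be illustrated with a minimal sketch. It is not the paper's implementation and does not perform any factorization; it only shows how, for each relation, the sets of observed subjects and objects could be derived from the triples and how much smaller the resulting candidate space is than the full entity-by-entity grid. The function names `local_closed_world` and `candidate_counts` and the toy triples are assumptions made for illustration.

```python
from collections import defaultdict


def local_closed_world(triples):
    """Approximate the domain and range of each relation.

    An entity belongs to the domain (range) of a relation only if it is
    observed at least once as subject (object) of that relation.
    """
    domain = defaultdict(set)
    range_ = defaultdict(set)
    for subj, rel, obj in triples:
        domain[rel].add(subj)
        range_[rel].add(obj)
    return dict(domain), dict(range_)


def candidate_counts(triples):
    """Per relation: (unconstrained cell count, type-constrained cell count)."""
    entities = {e for subj, _, obj in triples for e in (subj, obj)}
    domain, range_ = local_closed_world(triples)
    n = len(entities)
    return {
        rel: (n * n, len(domain[rel]) * len(range_[rel]))
        for rel in domain
    }


# Hypothetical toy data, not taken from Cora or DBpedia.
triples = [
    ("alice", "bornIn", "berlin"),
    ("bob", "bornIn", "paris"),
    ("alice", "knows", "bob"),
    ("berlin", "locatedIn", "germany"),
]

domain, range_ = local_closed_world(triples)
counts = candidate_counts(triples)
```

In this toy example, `bornIn` is restricted to the two observed subjects and two observed objects, so a factorization would only need to consider a 2 x 2 block of that relation's slice rather than all 5 x 5 entity pairs.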
