Typed Tensor Decomposition of Knowledge Bases for Relation Extraction

While relation extraction has traditionally been viewed as a task relying solely on textual data, recent work has shown that by taking as input existing facts in the form of entity-relation triples from both knowledge bases and textual data, the performance of relation extraction can be improved significantly. Following this new paradigm, we propose a tensor decomposition approach for knowledge base embedding that is highly scalable, and is especially suitable for relation extraction. By leveraging relational domain knowledge about entity type information, our learning algorithm is significantly faster than previous approaches and is better able to discover new relations missing from the database. In addition, when applied to a relation extraction task, our approach alone is comparable to several existing systems, and improves the weighted mean average precision of a state-of-theart method by 10 points when used as a subcomponent.

[1]  Henry Lieberman,et al.  AnalogySpace: Reducing the Dimensionality of Common Sense Knowledge , 2008, AAAI.

[2]  Geoffrey J. Gordon,et al.  Relational learning via collective matrix factorization , 2008, KDD.

[3]  Nikos D. Sidiropoulos,et al.  Turbo-SMT: Accelerating Coupled Sparse Matrix-Tensor Factorizations by 200x , 2014, SDM.

[4]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[5]  Andrew McCallum,et al.  Modeling Relations and Their Mentions without Labeled Text , 2010, ECML/PKDD.

[6]  Alan J. Laub,et al.  Matrix analysis - for scientists and engineers , 2004 .

[7]  Andrew McCallum,et al.  Structured Relation Discovery using Generative Models , 2011, EMNLP.

[8]  Steffen Staab,et al.  TripleRank: Ranking Semantic Web Data by Tensor Decomposition , 2009, SEMWEB.

[9]  Christopher D. Manning,et al.  Introduction to Information Retrieval , 2010, J. Assoc. Inf. Sci. Technol..

[10]  Ramesh Nallapati,et al.  Multi-instance Multi-label Learning for Relation Extraction , 2012, EMNLP.

[11]  Oren Etzioni,et al.  Modeling Missing Data in Distant Supervision for Information Extraction , 2013, TACL.

[12]  L. Tucker,et al.  Some mathematical notes on three-mode factor analysis , 1966, Psychometrika.

[13]  Daniel Jurafsky,et al.  Distant supervision for relation extraction without labeled data , 2009, ACL.

[14]  Kai-Wei Chang,et al.  Multi-Relational Latent Semantic Analysis , 2013, EMNLP.

[15]  Fabian M. Suchanek,et al.  Inside YAGO2s: a transparent information extraction architecture , 2013, WWW '13 Companion.

[16]  Tamara G. Kolda,et al.  MATLAB Tensor Toolbox , 2006 .

[17]  Jason Weston,et al.  Translating Embeddings for Modeling Multi-relational Data , 2013, NIPS.

[18]  Jason Weston,et al.  A semantic matching energy function for learning with multi-relational data , 2013, Machine Learning.

[19]  Andrew McCallum,et al.  Relation Extraction with Matrix Factorization and Universal Schemas , 2013, NAACL.

[20]  Estevam R. Hruschka,et al.  Toward an Architecture for Never-Ending Language Learning , 2010, AAAI.

[21]  Danqi Chen,et al.  Reasoning With Neural Tensor Networks for Knowledge Base Completion , 2013, NIPS.

[22]  Tamara G. Kolda,et al.  Tensor Decompositions and Applications , 2009, SIAM Rev..

[23]  Christos Faloutsos,et al.  GigaTensor: scaling tensor analysis up by 100 times - algorithms and discoveries , 2012, KDD.

[24]  Razvan C. Bunescu,et al.  Learning to Extract Relations from the Web using Minimal Supervision , 2007, ACL.

[25]  Jason Weston,et al.  Connecting Language and Knowledge Bases with Embedding Models for Relation Extraction , 2013, EMNLP.

[26]  Hans-Peter Kriegel,et al.  Factorizing YAGO: scalable machine learning for linked data , 2012, WWW.

[27]  Hans-Peter Kriegel,et al.  A Three-Way Model for Collective Learning on Multi-Relational Data , 2011, ICML.

[28]  Joos Vandewalle,et al.  A Multilinear Singular Value Decomposition , 2000, SIAM J. Matrix Anal. Appl..

[29]  J. Kruskal Three-way arrays: rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics , 1977 .

[30]  Tamara G. Kolda,et al.  Temporal Analysis of Semantic Graphs Using ASALSAN , 2007, Seventh IEEE International Conference on Data Mining (ICDM 2007).

[31]  Joshua B. Tenenbaum,et al.  Modelling Relational Data using Bayesian Clustered Tensor Factorization , 2009, NIPS.

[32]  Theodoros Rekatsinas,et al.  Multi-relational Learning Using Weighted Tensor Decomposition with Modular Loss , 2013, ArXiv.

[33]  Geoffrey Zweig,et al.  Linguistic Regularities in Continuous Space Word Representations , 2013, NAACL.

[34]  H. Kiers Towards a standardized notation and terminology in multiway analysis , 2000 .