Injecting Logical Background Knowledge into Embeddings for Relation Extraction

Matrix factorization approaches to relation extraction provide several attractive features: they support distant supervision, handle open schemas, and leverage unlabeled data. Unfortunately, these methods share a shortcoming with all other distantly supervised approaches: they cannot learn to extract target relations for which the knowledge base contains no data, and they are inaccurate for relations with sparse data. Rule-based extractors, on the other hand, can be easily extended to novel relations and improved for existing but inaccurate relations through first-order formulae that capture auxiliary domain knowledge. However, a large set of such formulae is usually necessary to achieve generalization. In this paper, we introduce a paradigm for learning low-dimensional embeddings of entity-pairs and relations that combines the advantages of matrix factorization with first-order logic domain knowledge. We introduce simple approaches for estimating such embeddings, as well as a novel training algorithm that jointly optimizes over factual and first-order logic information. Our results show that this method learns accurate extractors with little or no distant supervision alignments, while also generalizing to textual patterns that do not appear in the formulae.
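As a rough illustration of what jointly optimizing over facts and formulae could look like, the sketch below scores a fact as the sigmoid of a dot product between a relation embedding and an entity-pair embedding, and scores an implication formula by combining the two fact probabilities. The abstract does not specify these details: the function names, the independence assumption used for the implication, and the negative log-likelihood loss are all assumptions made for illustration, not the paper's stated method.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fact_prob(rel_vec, pair_vec):
    """Probability that a relation holds for an entity pair.

    Assumption: the score is the dot product of the two embeddings,
    as in logistic matrix factorization.
    """
    return sigmoid(sum(r * p for r, p in zip(rel_vec, pair_vec)))


def implication_prob(p_body, p_head):
    """Soft truth value of the formula body => head.

    Assumption: body and head are treated as independent, so the
    formula is false only when body is true and head is false.
    """
    return 1.0 - p_body * (1.0 - p_head)


def formula_loss(p_body, p_head):
    """Negative log-likelihood of the implication being true (hypothetical loss)."""
    return -math.log(implication_prob(p_body, p_head))


# Hypothetical 2-dimensional embeddings for one entity pair and two relations:
# a textual pattern (body) and a knowledge-base relation (head).
pair = [0.5, -1.0]
pattern_rel = [2.0, 0.0]
kb_rel = [0.0, 1.0]

p_body = fact_prob(pattern_rel, pair)
p_head = fact_prob(kb_rel, pair)
loss = formula_loss(p_body, p_head)
```

Under these assumptions, minimizing `formula_loss` alongside an ordinary loss on observed facts pushes the head relation's probability up wherever the body pattern is likely, even for relations that have no aligned facts in the knowledge base.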
