Learning Typed Entailment Graphs with Global Soft Constraints

This paper presents a new method for learning typed entailment graphs from text. We extract predicate-argument structures from multiple-source news corpora and compute local distributional similarity scores to learn entailments between predicates with typed arguments (e.g., person contracted disease). Previous work has used transitivity constraints to improve local decisions, but these constraints are intractable on large graphs. We instead propose a scalable method that learns globally consistent similarity scores using new soft constraints that consider structures both across typed entailment graphs and within each graph. Learning takes only a few hours over 100K predicates, and our results show large improvements over local similarity scores on two entailment data sets. We further show improvements over paraphrases and entailments from the Paraphrase Database and over prior state-of-the-art entailment graphs. Finally, we show that the entailment graphs improve performance on a downstream task.
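To make the notion of a local distributional similarity score concrete, the sketch below computes one common directional measure, Weeds precision, between two typed predicates. The abstract does not say which local score is used, so the choice of measure is an assumption made for illustration. The predicate "has", the argument pairs, and all counts are hypothetical toy values.

```python
from collections import Counter


def weeds_precision(p_feats, q_feats):
    """Directional score for 'p entails q'.

    Returns the share of p's feature weight that falls on features
    also observed with q (Weeds precision). This measure is an
    assumption here, not necessarily the score used in the paper.
    """
    total = sum(p_feats.values())
    if total == 0:
        return 0.0
    shared = sum(w for f, w in p_feats.items() if f in q_feats)
    return shared / total


# Hypothetical toy counts of (person, disease) argument pairs per predicate.
contracted = Counter({("alice", "flu"): 2, ("bob", "malaria"): 1})
has = Counter({("alice", "flu"): 3, ("bob", "malaria"): 2, ("carol", "asthma"): 4})

forward = weeds_precision(contracted, has)   # person contracted disease -> person has disease
backward = weeds_precision(has, contracted)  # person has disease -> person contracted disease
print(forward, backward)
```

On these toy counts the forward score is higher than the backward score, which is the asymmetry an entailment graph needs: every argument pair seen with "contracted" is also seen with "has", but not the reverse. Scores of this kind are only the local starting point; the global soft constraints described above would then adjust them.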
