Learning Entailment-Based Sentence Embeddings from Natural Language Inference

Large datasets on natural language inference are a potentially valuable resource for inducing semantic representations of natural language sentences. But in many such models the embeddings computed by the sentence encoder go through an MLP-based interaction layer before the label is predicted, and thus some of the information about textual entailment is encoded in the interpretation of sentence embeddings given by this parameterised MLP. In this work we propose a simple interaction layer based on predefined entailment and contradiction scores applied directly to the sentence embeddings. This parameter-free interaction model achieves results on natural language inference competitive with MLP-based models, demonstrating that the trained sentence embeddings directly represent the information needed for textual entailment. The inductive bias of this model leads to better generalisation to other natural language inference datasets, and the resulting sentence embeddings are better for other semantic tasks.
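To make the idea of a parameter-free interaction layer concrete, the following is a minimal sketch and not the scoring functions of this work. It assumes, purely for illustration, that each sentence embedding lies in [-1, 1]^d, where the sign of a dimension says whether a feature is asserted or denied and the magnitude gives the confidence. The names `entailment_score`, `contradiction_score` and `predict_label`, the score definitions, and the fixed 0.5 threshold are all hypothetical. The point it shows is that the label is computed from the two embeddings alone, with no trainable weights between encoder and prediction.

```python
import numpy as np


def entailment_score(premise: np.ndarray, hypothesis: np.ndarray) -> float:
    """Hypothetical score: how fully the premise supports what the hypothesis asserts."""
    # Support the premise gives to each feature, in the direction the hypothesis asserts it.
    support = np.maximum(0.0, premise * np.sign(hypothesis))
    # Part of each hypothesis feature that the premise leaves unsupported.
    missing = np.maximum(0.0, np.abs(hypothesis) - support)
    # Equals 1.0 only when nothing asserted by the hypothesis is missing from the premise.
    return float(np.prod(1.0 - missing))


def contradiction_score(premise: np.ndarray, hypothesis: np.ndarray) -> float:
    """Hypothetical score: how strongly some feature is asserted with opposite signs."""
    conflict = np.maximum(0.0, -premise * hypothesis)
    # Noisy-or over dimensions: one strong conflict is enough.
    return float(1.0 - np.prod(1.0 - conflict))


def predict_label(premise: np.ndarray, hypothesis: np.ndarray) -> str:
    """Map the two fixed scores to a label; the 0.5 threshold is fixed, not learned."""
    if contradiction_score(premise, hypothesis) > 0.5:
        return "contradiction"
    if entailment_score(premise, hypothesis) > 0.5:
        return "entailment"
    return "neutral"


if __name__ == "__main__":
    premise = np.array([1.0, -1.0, 0.0])
    print(predict_label(premise, np.array([1.0, 0.0, 0.0])))
    print(predict_label(premise, np.array([-1.0, 0.0, 0.0])))
    print(predict_label(np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])))
```

Because nothing in this layer is learned, any training signal under such a design would have to shape the encoder itself, which is the property the abstract attributes to the proposed model.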
