Recognizing Textual Entailment with Attentive Reading and Writing Operations

Inferring the entailment relation between a pair of natural language sentences is fundamental to artificial intelligence. Recently there has been rising interest in modeling the task with neural attentive models. However, existing models have a major limitation in keeping track of the attention history, because usually only a single vector is used to memorize past attention information. We argue that this history matters, based on our observation that potential alignment clues are not always concentrated in one place; instead, they may be spread widely across the sentence, which gives rise to a long-range dependency problem. In this paper, we propose to complement the conventional attentive reading operation with two writing operations, forget and update. Instead of squeezing the attention history into a single vector, we write past attention information directly into the sentence representations, thereby achieving a higher memory capacity for the attention history. Experiments on the Stanford Natural Language Inference (SNLI) corpus demonstrate the efficacy of our proposed architecture.

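The abstract gives no equations, but the read-then-write mechanism it describes can be sketched roughly as follows. The PyTorch snippet below is a minimal illustration under our own assumptions: the class name AttentiveReadWrite, the gate parameterizations, and the reuse of the attention weights as write weights are hypothetical choices, not the authors' exact formulation. It performs a standard attentive read over premise token representations and then applies forget and update writes so that the attention history is stored directly in those representations rather than in a single summary vector.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentiveReadWrite(nn.Module):
    """Illustrative sketch (not the paper's exact equations): an attentive read
    over token-level memory slots, followed by forget/update writes that modify
    the slots in place so past attention is recorded in the sentence itself."""

    def __init__(self, hidden_size):
        super().__init__()
        # All four projections are assumed parameterizations for this sketch.
        self.attn_score = nn.Linear(2 * hidden_size, 1)
        self.forget_gate = nn.Linear(2 * hidden_size, hidden_size)
        self.update_gate = nn.Linear(2 * hidden_size, hidden_size)
        self.candidate = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, memory, query):
        # memory: (batch, seq_len, hidden)  premise token representations
        # query:  (batch, hidden)           current hypothesis state
        q = query.unsqueeze(1).expand_as(memory)
        pair = torch.cat([memory, q], dim=-1)

        # Attentive read: soft alignment of the query against every slot.
        scores = self.attn_score(pair).squeeze(-1)            # (batch, seq_len)
        alpha = F.softmax(scores, dim=-1)
        read = torch.bmm(alpha.unsqueeze(1), memory).squeeze(1)

        # Write back: erase part of each attended slot (forget), then add new
        # content (update), with the attention weights acting as write weights.
        f = torch.sigmoid(self.forget_gate(pair))
        u = torch.sigmoid(self.update_gate(pair))
        cand = torch.tanh(self.candidate(pair))
        w = alpha.unsqueeze(-1)                                # (batch, seq_len, 1)
        memory = memory * (1.0 - w * f) + w * u * cand

        return read, memory


# Toy usage: one read/write step over a batch of premise encodings.
model = AttentiveReadWrite(hidden_size=128)
premise = torch.randn(4, 20, 128)
hyp_state = torch.randn(4, 128)
read_vec, premise = model(premise, hyp_state)
```

The key design point this sketch tries to capture is that the write weights coincide with the attention weights: slots that were attended to are the ones that get rewritten, so subsequent reads over the same sentence automatically see which positions have already contributed alignment evidence.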