Research on attention memory networks as a model for learning natural language inference

Natural Language Inference (NLI) is a fundamental task in natural language processing with many applications; it is concerned with classifying the logical relation between two sentences. In this paper, we propose attention memory networks (AMNs) to recognize entailment and contradiction between two sentences. In our model, an attention memory neural network (AMNN) has a variable-sized encoding memory and supports semantic compositionality. The AMNN captures sentence-level semantics and reasons about the relation between the sentence pair; a Sparsemax layer is then applied over the generated matching vectors for classification. Our experiments on the Stanford Natural Language Inference (SNLI) corpus show that our model outperforms the state of the art, achieving an accuracy of 87.4% on the test data.
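Since the abstract only names the final classification layer, the following is a minimal sketch of the Sparsemax projection it refers to (the Euclidean projection of logits onto the probability simplex, as in Martins and Fernández Astudillo, 2016). The NumPy implementation and the example three-way logits are illustrative assumptions, not the authors' code.

    import numpy as np

    def sparsemax(z):
        # Euclidean projection of the logit vector z onto the probability simplex.
        # Unlike softmax, low-scoring classes can receive exactly zero probability.
        z = np.asarray(z, dtype=np.float64)
        z_sorted = np.sort(z)[::-1]                        # logits in descending order
        cumsum = np.cumsum(z_sorted)
        k = np.arange(1, z.size + 1)
        support = 1.0 + k * z_sorted > cumsum              # classes kept in the support
        tau = (cumsum[support][-1] - 1.0) / k[support][-1] # threshold
        return np.maximum(z - tau, 0.0)

    # Hypothetical three-way logits (e.g. entailment, contradiction, neutral):
    print(sparsemax([2.0, 1.0, 0.1]))                      # -> [1.0, 0.0, 0.0]

In this example the two low-scoring classes receive exactly zero probability, which is the sparsity property that distinguishes a Sparsemax output layer from a softmax one.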
