Enhanced LSTM for Natural Language Inference

Reasoning and inference are central to both human and artificial intelligence, and modeling inference in human language is very challenging. With the availability of large annotated data (Bowman et al., 2015), it has recently become feasible to train neural-network-based inference models, which have been shown to be very effective. In this paper, we present a new state-of-the-art result, achieving an accuracy of 88.6% on the Stanford Natural Language Inference (SNLI) dataset. Unlike previous top models that use very complicated network architectures, we first demonstrate that carefully designed sequential inference models based on chain LSTMs can outperform all previous models. Building on this, we further show that explicitly considering recursive architectures in both local inference modeling and inference composition yields additional improvement. In particular, incorporating syntactic parsing information contributes to our best result: it further improves the performance even when added to an already very strong model.
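
As a rough illustration of the sequential chain-LSTM branch described above, here is a minimal sketch in PyTorch: BiLSTM input encoding, attention-based local inference modeling with the commonly used concatenation/difference/product enhancement, and a second BiLSTM for inference composition. Layer names, sizes, and the pooling choices are illustrative assumptions, not the exact configuration reported in the paper, and the syntactic tree-LSTM variant is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SequentialInferenceModel(nn.Module):
    """Sketch of a chain-LSTM NLI model: encode, align, enhance, compose, classify."""

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=300, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Input encoding: a shared chain BiLSTM over premise and hypothesis.
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        # Project the enhanced local-inference features back down before composition.
        self.project = nn.Linear(8 * hidden_dim, hidden_dim)
        # Inference composition: a second BiLSTM over the projected features.
        self.composer = nn.LSTM(hidden_dim, hidden_dim, batch_first=True,
                                bidirectional=True)
        self.classifier = nn.Sequential(
            nn.Linear(8 * hidden_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, num_classes))

    def forward(self, premise, hypothesis):
        a, _ = self.encoder(self.embed(premise))       # (B, La, 2H)
        b, _ = self.encoder(self.embed(hypothesis))    # (B, Lb, 2H)
        # Local inference: soft-align each sentence against the other.
        e = torch.bmm(a, b.transpose(1, 2))            # (B, La, Lb)
        a_tilde = torch.bmm(F.softmax(e, dim=2), b)    # hypothesis content aligned to premise
        b_tilde = torch.bmm(F.softmax(e, dim=1).transpose(1, 2), a)
        # Enhancement: concatenation, difference, and element-wise product.
        m_a = torch.cat([a, a_tilde, a - a_tilde, a * a_tilde], dim=-1)
        m_b = torch.cat([b, b_tilde, b - b_tilde, b * b_tilde], dim=-1)
        v_a, _ = self.composer(F.relu(self.project(m_a)))
        v_b, _ = self.composer(F.relu(self.project(m_b)))
        # Pooling over time (average and max) for both sentences, then classify.
        v = torch.cat([v_a.mean(1), v_a.max(1).values,
                       v_b.mean(1), v_b.max(1).values], dim=-1)
        return self.classifier(v)
```

Usage would be along the lines of `logits = model(premise_ids, hypothesis_ids)`, with both arguments being padded index tensors of shape (batch, length).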

[1] Shuohang Wang, et al. Learning Natural Language Inference with LSTM, 2015, NAACL.

[2] Mirella Lapata, et al. Long Short-Term Memory-Networks for Machine Reading, 2016, EMNLP.

[3] Biswajit Paria, et al. A Neural Architecture Mimicking Humans End-to-End for Natural Language Inference, 2016, ArXiv.

[4] Phil Blunsom, et al. Reasoning about Entailment with Neural Attention, 2015, ICLR.

[5] Quoc V. Le, et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[6] Yang Liu, et al. Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention, 2016, ArXiv.

[7] Ido Dagan, et al. The Third PASCAL Recognizing Textual Entailment Challenge, 2007, ACL-PASCAL@ACL.

[8] Hong Yu, et al. Neural Tree Indexers for Text Understanding, 2016, EACL.

[9] Barbara H. Partee, et al. Lexical semantics and compositionality, 1995.

[10] Christopher Potts, et al. A large annotated corpus for learning natural language inference, 2015, EMNLP.

[11] Christopher D. Manning, et al. Modeling Semantic Containment and Exclusion in Natural Language Inference, 2008, COLING.

[12] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.

[13] Rui Yan, et al. Natural Language Inference by Tree-Based Convolution and Heuristic Matching, 2015, ACL.

[14] Phong Le, et al. Compositional Distributional Semantics with Long Short Term Memory, 2015, *SEM.

[15] Yoshua Bengio, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, 2015, ICML.

[16] Kentaro Inui, et al. Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, 2007, ACL.

[17] Hongyu Guo, et al. Long Short-Term Memory Over Recursive Structures, 2015, ICML.

[18] Yoshua Bengio, et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches, 2014, SSST@EMNLP.

[19] Christopher D. Manning, et al. Natural language inference, 2009.

[20] Yoshua Bengio, et al. Attention-Based Models for Speech Recognition, 2015, NIPS.

[21] Zhen-Hua Ling, et al. Distraction-Based Neural Networks for Modeling Documents, 2016, IJCAI.

[22] Fabio Massimo Zanzotto, et al. Towards Syntax-aware Compositional Distributional Semantic Models, 2014, COLING.

[23] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.

[24] Jakob Uszkoreit, et al. A Decomposable Attention Model for Natural Language Inference, 2016, EMNLP.

[25] Andrew Y. Ng, et al. Parsing Natural Scenes and Natural Language with Recursive Neural Networks, 2011, ICML.

[26] Alessandro Moschitti, et al. Syntactic/Semantic Structures for Textual Entailment Recognition, 2010, NAACL.

[27] Zhifang Sui, et al. Reading and Thinking: Re-read LSTM Unit for Textual Entailment Recognition, 2016, COLING.

[28] Li-Rong Dai, et al. Exploring Question Understanding and Adaptation in Neural-Network-Based Question Answering, 2017, ArXiv.

[29] Dan Klein, et al. Accurate Unlexicalized Parsing, 2003, ACL.

[30] Christopher D. Manning, et al. Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks, 2015, ACL.

[31] Hongyu Guo, et al. An Empirical Study on the Effect of Negation Words on Sentiment, 2014, ACL.

[32] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.

[33] Christopher Potts, et al. A Fast Unified Model for Parsing and Sentence Understanding, 2016, ACL.

[34] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.

[35] Sanja Fidler, et al. Order-Embeddings of Images and Language, 2015, ICLR.

[36] Hong Yu, et al. Neural Semantic Encoders, 2016, EACL.

[37] Jason Weston, et al. A Neural Attention Model for Abstractive Sentence Summarization, 2015, EMNLP.

[38] Wei Lin, et al. Revisiting Word Embedding for Contrasting Meaning, 2015, ACL.

[39] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.