Context-Aware Dual-Attention Network for Natural Language Inference

Natural Language Inference (NLI) is a fundamental task in natural language understanding. Despite the importance of existing research on NLI, how to exploit the context of sentences to capture inference relations more precisely (e.g., by addressing issues such as polysemy and ambiguity) remains largely an open problem. In this paper, we introduce the corresponding image into the inference process. Along this line, we design a novel Context-Aware Dual-Attention Network (CADAN) for tackling the NLI task. Specifically, we first utilize the corresponding images as the Image Attention to construct enriched representations of the sentences. Then, we use the enriched representations as the Sentence Attention to analyze the inference relations from detailed perspectives. Finally, a sentence matching method is designed to determine the inference relation between the sentences in a pair. Experimental results on large-scale NLI corpora and a real-world NLI-like corpus demonstrate the superior performance of our CADAN model.
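
To make the dual-attention pipeline concrete, the sketch below implements one plausible reading of it in PyTorch. The abstract does not specify the architecture's internals, so the module names, dimensions, attention formulas, and the pooling-based matcher here are all illustrative assumptions rather than the paper's actual CADAN design.

```python
# A minimal sketch of the CADAN-style dual-attention flow described above.
# Assumptions (not from the paper): BiLSTM encoders, dot-product attention,
# max-pooling, and an MLP matcher; dimensions are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CADANSketch(nn.Module):
    def __init__(self, hidden=300, img_dim=2048, n_classes=3):
        super().__init__()
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True,
                               bidirectional=True)
        self.img_proj = nn.Linear(img_dim, 2 * hidden)  # image -> sentence space
        self.classifier = nn.Sequential(
            nn.Linear(8 * hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes))

    def image_attention(self, img, words):
        # Image Attention: the image feature attends over word states to
        # build an image-enriched sentence representation.
        q = self.img_proj(img).unsqueeze(1)               # (B, 1, 2H)
        scores = torch.bmm(q, words.transpose(1, 2))      # (B, 1, T)
        alpha = F.softmax(scores, dim=-1)
        return words + torch.bmm(alpha.transpose(1, 2), q)  # (B, T, 2H)

    def sentence_attention(self, a, b):
        # Sentence Attention: cross-attention between the enriched premise
        # and hypothesis, pooled into fixed-size matching vectors.
        e = torch.bmm(a, b.transpose(1, 2))               # (B, Ta, Tb)
        a_ctx = torch.bmm(F.softmax(e, dim=2), b)         # b aligned to a
        b_ctx = torch.bmm(F.softmax(e, dim=1).transpose(1, 2), a)
        va = torch.cat([a, a_ctx], dim=-1).max(dim=1).values
        vb = torch.cat([b, b_ctx], dim=-1).max(dim=1).values
        return va, vb

    def forward(self, premise, hypothesis, img):
        # premise/hypothesis: (B, T, hidden) word embeddings;
        # img: (B, img_dim) features of the corresponding image (e.g. CNN output).
        p, _ = self.encoder(premise)
        h, _ = self.encoder(hypothesis)
        p = self.image_attention(img, p)    # image-enriched premise
        h = self.image_attention(img, h)    # image-enriched hypothesis
        vp, vh = self.sentence_attention(p, h)
        return self.classifier(torch.cat([vp, vh], dim=-1))

# Hypothetical usage with random inputs:
model = CADANSketch()
p = torch.randn(2, 12, 300)     # premise word embeddings
h = torch.randn(2, 9, 300)      # hypothesis word embeddings
img = torch.randn(2, 2048)      # image features shared by the pair
logits = model(p, h, img)       # (2, 3): entailment / neutral / contradiction
```

One design choice worth noting in this sketch: the same image feature enriches both sentences before they attend to each other, so the cross-attention step compares image-grounded rather than purely textual representations, which is one way the visual context could help disambiguate polysemous words.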
