Natural Language Inference Based on the LIC Architecture with DCAE Feature

Natural Language Inference (NLI), also known as Recognizing Textual Entailment (RTE), aims to identify the logical relationship between a premise and a hypothesis. In this paper, a DCAE (Directly-Conditional-Attention-Encoding) feature based on Bi-LSTM and a new architecture named LIC (LSTM-Interaction-CNN) are proposed for the NLI task. In the proposed algorithm, Bi-LSTM layers first model the sentences to obtain the DCAE feature; the DCAE feature is then reconstructed into image-like representations through an interaction layer, which enriches the relevant information and makes the feature suitable for convolutional processing; finally, CNN layers extract high-level relevance features and relation patterns, and the classification result is obtained through an MLP (Multi-Layer Perceptron). The proposed algorithm fully combines the strengths of LSTM layers in sequence modeling with those of CNN layers in feature extraction. Experiments show that the model achieves state-of-the-art results on the SNLI and Multi-NLI datasets.
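
A minimal PyTorch sketch of this LSTM-Interaction-CNN pipeline is given below. The abstract does not specify how the DCAE feature or the interaction layer is computed, so the interaction here is approximated as a hypothetical token-by-token dot-product matching matrix over shared Bi-LSTM encodings; all layer sizes, kernel sizes, and the three-way label set are illustrative assumptions, not the authors' settings.

```python
# Hypothetical sketch of the LIC pipeline: Bi-LSTM encoding -> interaction
# "image" -> CNN feature extraction -> MLP classifier. The interaction is
# approximated by dot-product matching; the paper's DCAE details may differ.
import torch
import torch.nn as nn

class LICSketch(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=150, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Shared Bi-LSTM sentence encoder for premise and hypothesis.
        self.encoder = nn.LSTM(embed_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        # CNN layers treat the interaction matrix as a single-channel image.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveMaxPool2d((8, 8)),  # fixed-size map for variable lengths
        )
        # MLP head over the flattened CNN feature maps.
        self.mlp = nn.Sequential(
            nn.Linear(8 * 8 * 8, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, premise_ids, hypothesis_ids):
        p, _ = self.encoder(self.embed(premise_ids))     # (B, Lp, 2H)
        h, _ = self.encoder(self.embed(hypothesis_ids))  # (B, Lh, 2H)
        # Interaction layer: token-by-token matching scores form a
        # 2-D "image" of shape (B, 1, Lp, Lh).
        image = torch.bmm(p, h.transpose(1, 2)).unsqueeze(1)
        features = self.conv(image).flatten(1)
        return self.mlp(features)  # logits over {entailment, contradiction, neutral}

# Usage example on a dummy batch of token-id sequences.
model = LICSketch(vocab_size=10000)
premise = torch.randint(0, 10000, (2, 12))
hypothesis = torch.randint(0, 10000, (2, 9))
print(model(premise, hypothesis).shape)  # torch.Size([2, 3])
```

The adaptive pooling step is one simple way to let the CNN accept premise/hypothesis pairs of arbitrary lengths while still feeding a fixed-size vector to the MLP.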
