Enhancing Sentence Relation Modeling with Auxiliary Character-level Embedding

Neural network based approaches to sentence relation modeling automatically generate hidden matching features from raw sentence pairs. However, the quality of the matching feature representations may be unsatisfactory for complex semantic relations such as entailment or contradiction. To address this challenge, we propose a new deep neural network architecture that jointly leverages pre-trained word embeddings and auxiliary character-level embeddings to learn sentence meanings. The two kinds of word sequence representations are fed into a multi-layer bidirectional LSTM to learn enhanced sentence representations. We then construct matching features, followed by a temporal CNN that learns high-level hidden matching feature representations. Experimental results demonstrate that our approach consistently outperforms existing methods on standard evaluation datasets.
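The pipeline described above can be sketched in a minimal, simplified form. This is not the paper's implementation: the dimensions are hypothetical, character-level word vectors are approximated here by averaging character embeddings, the BiLSTM is replaced by a mean-pooling stand-in, and the matching features (element-wise product and absolute difference) are a common choice assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- the abstract does not specify these values.
WORD_DIM, CHAR_DIM = 50, 20

def char_word_embedding(word, char_vocab, char_emb):
    """Average character embeddings as a simple character-level word vector."""
    idxs = [char_vocab[c] for c in word if c in char_vocab]
    return char_emb[idxs].mean(axis=0) if idxs else np.zeros(char_emb.shape[1])

def sentence_matrix(words, word_emb, char_vocab, char_emb):
    """Concatenate pre-trained word vectors with auxiliary character vectors."""
    rows = []
    for w in words:
        wv = word_emb.get(w, np.zeros(WORD_DIM))
        cv = char_word_embedding(w, char_vocab, char_emb)
        rows.append(np.concatenate([wv, cv]))
    return np.stack(rows)           # shape: (len(words), WORD_DIM + CHAR_DIM)

def matching_features(h1, h2):
    """Element-wise product and absolute difference of sentence vectors
    (an assumed, commonly used matching scheme, not taken from the paper)."""
    return np.concatenate([h1 * h2, np.abs(h1 - h2)])

# Toy vocabularies and randomly initialized embeddings.
chars = "abcdefghijklmnopqrstuvwxyz"
char_vocab = {c: i for i, c in enumerate(chars)}
char_emb = rng.standard_normal((len(chars), CHAR_DIM))
word_emb = {w: rng.standard_normal(WORD_DIM) for w in "a dog runs cat sits".split()}

s1 = sentence_matrix("a dog runs".split(), word_emb, char_vocab, char_emb)
s2 = sentence_matrix("a cat sits".split(), word_emb, char_vocab, char_emb)

# Stand-in for the multi-layer BiLSTM: mean-pool each sentence to one vector.
feats = matching_features(s1.mean(axis=0), s2.mean(axis=0))
print(s1.shape, feats.shape)        # (3, 70) (140,)
```

In the full architecture, the mean-pooling step would be the multi-layer BiLSTM over the concatenated inputs, and the resulting matching features would be passed to a temporal CNN rather than used directly.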
