Dynamic Compositionality in Recursive Neural Networks with Structure-aware Tag Representations

Most existing recursive neural network (RvNN) architectures use only the structure of parse trees, ignoring the syntactic tags that parsing provides as a by-product. We present a novel RvNN architecture that achieves dynamic compositionality by exploiting comprehensive syntactic information derived from both the tree structure and the linguistic tags. Specifically, we introduce a structure-aware tag representation constructed by a separate tag-level tree-LSTM. With it, we control the composition function of the existing word-level tree-LSTM by feeding the tag representation as a supplementary input to the gate functions of the tree-LSTM. In extensive experiments, we show that models built on the proposed architecture obtain superior or competitive performance on several sentence-level tasks, such as sentiment analysis and natural language inference, compared with previous tree-structured models and other sophisticated neural models.
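To make the gate-augmentation idea concrete, the following is a minimal NumPy sketch of a binary tree-LSTM composition step in which a structure-aware tag representation (assumed here to come from a separate tag-level tree-LSTM) is concatenated with the two children's hidden states before every gate. All names, dimensions, and the exact parameterization are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TagAugmentedTreeLSTMCell:
    """Binary tree-LSTM composition whose gates also see a tag representation.

    Hypothetical sketch: one weight matrix per gate over the concatenated
    input [h_left; h_right; tag_rep], so the tag vector modulates the
    input gate, both forget gates, the output gate, and the candidate.
    """

    def __init__(self, d_word, d_tag, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = 2 * d_word + d_tag
        gates = ["i", "fl", "fr", "o", "u"]
        self.W = {g: rng.normal(0, 0.1, (d_word, in_dim)) for g in gates}
        self.b = {g: np.zeros(d_word) for g in gates}

    def __call__(self, hl, cl, hr, cr, tag_rep):
        x = np.concatenate([hl, hr, tag_rep])
        i = sigmoid(self.W["i"] @ x + self.b["i"])    # input gate
        fl = sigmoid(self.W["fl"] @ x + self.b["fl"])  # forget gate, left child
        fr = sigmoid(self.W["fr"] @ x + self.b["fr"])  # forget gate, right child
        o = sigmoid(self.W["o"] @ x + self.b["o"])    # output gate
        u = np.tanh(self.W["u"] @ x + self.b["u"])    # candidate update
        c = i * u + fl * cl + fr * cr
        h = o * np.tanh(c)
        return h, c

# Toy usage: compose two leaf states under a (dummy) tag representation.
d_word, d_tag = 4, 3
cell = TagAugmentedTreeLSTMCell(d_word, d_tag)
hl = cl = hr = cr = np.zeros(d_word)
tag_rep = np.ones(d_tag)  # stand-in for the tag-level tree-LSTM's output
h, c = cell(hl, cl, hr, cr, tag_rep)
print(h.shape)  # (4,)
```

Because the tag representation enters every gate, different syntactic categories can induce different effective composition functions at each node, which is the sense in which the composition is "dynamic".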
