Learning to Compose over Tree Structures via POS Tags

The Recursive Neural Network (RecNN), a class of models that compose words or phrases recursively over syntactic tree structures, has proven effective at obtaining sentence representations for a variety of NLP tasks. However, RecNN suffers from an inherent weakness: a single composition function shared across all tree nodes cannot capture the complexity of semantic composition, which limits the model's expressive power. To address this problem, we propose the Tag-Guided HyperRecNN/TreeLSTM (TG-HRecNN/TreeLSTM), which introduces a hypernetwork into RecNNs that takes the Part-of-Speech (POS) tags of words and phrases as input and dynamically generates the parameters of the semantic composition function. Experimental results on five datasets covering two typical NLP tasks show that both proposed models consistently and significantly improve over RecNN and TreeLSTM. Our TG-HTreeLSTM outperforms all existing RecNN-based models and achieves or is competitive with the state of the art on four sentence classification benchmarks. Qualitative analysis further demonstrates the effectiveness of our models.
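To make the tag-guided mechanism concrete, below is a minimal sketch of the general recipe the abstract describes: a hypernetwork maps a node's POS-tag embedding to the weights of that node's composition function, so different syntactic categories compose their children differently. All names, layer choices, and dimensions here are illustrative assumptions, not the authors' implementation; the paper's actual model uses a TreeLSTM cell rather than this plain recursive cell.

```python
import torch
import torch.nn as nn

class TagGuidedRecNNCell(nn.Module):
    """Hypothetical tag-guided composition cell (simplified RecNN variant)."""

    def __init__(self, num_tags: int, tag_dim: int, hidden_dim: int):
        super().__init__()
        self.tag_embed = nn.Embedding(num_tags, tag_dim)
        # Hypernetwork: POS-tag embedding -> per-node composition parameters.
        # It emits a (hidden_dim x 2*hidden_dim) weight matrix plus a bias,
        # so the composition function varies with the node's syntactic tag.
        self.hyper_weight = nn.Linear(tag_dim, hidden_dim * 2 * hidden_dim)
        self.hyper_bias = nn.Linear(tag_dim, hidden_dim)
        self.hidden_dim = hidden_dim

    def forward(self, tag_id, left_child, right_child):
        t = self.tag_embed(tag_id)                        # (tag_dim,)
        W = self.hyper_weight(t).view(self.hidden_dim, 2 * self.hidden_dim)
        b = self.hyper_bias(t)                            # (hidden_dim,)
        children = torch.cat([left_child, right_child])   # (2*hidden_dim,)
        return torch.tanh(W @ children + b)               # parent vector

# Illustrative usage: compose two child vectors under an NP-like tag.
cell = TagGuidedRecNNCell(num_tags=45, tag_dim=16, hidden_dim=64)
parent = cell(torch.tensor(7), torch.randn(64), torch.randn(64))
```

The key design point, per the abstract, is that composition weights are generated dynamically from the tag rather than stored as one shared matrix, trading a fixed parameter set for tag-conditioned compositionality.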
