Bidirectional Tree-Structured LSTM with Head Lexicalization

Sequential LSTMs have been extended to model tree structures, giving competitive results on a number of tasks. Existing methods model constituent trees by bottom-up combination of constituent nodes, making direct use of input word information only at the leaf nodes. This differs from sequential LSTMs, which take an input word at every step. In this paper, we propose a method for automatic head lexicalization of tree-structured LSTMs, propagating head words from the leaf nodes up to every constituent node. In addition, enabled by head lexicalization, we build a tree LSTM in the top-down direction, which structurally corresponds to a bidirectional sequential LSTM. Experiments show that both extensions give better representations of tree structures. Our final model gives the best results on the Stanford Sentiment Treebank and highly competitive results on the TREC question type classification task.
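To make the idea concrete, the following is a minimal sketch of a binary tree LSTM cell with soft head lexicalization, not the paper's exact equations: a learned scalar gate interpolates the two children's head vectors, and the resulting head vector serves as the lexical input to the node's gates and is passed upward. The class name, the parameterization, and the leaf initialization are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class HeadLexTreeLSTMCell(nn.Module):
    """Binary tree LSTM cell with soft head lexicalization (illustrative).

    Each node carries a triple (h, c, x): hidden state, cell state, and a
    head-word vector propagated up from the leaves.
    """

    def __init__(self, dim):
        super().__init__()
        # One projection producing the five gates i, f_l, f_r, o, u from
        # the node's head vector and the two children's hidden states.
        self.gates = nn.Linear(3 * dim, 5 * dim)
        # Scalar gate that softly selects the head word between children.
        self.head_gate = nn.Linear(2 * dim, 1)

    def forward(self, left, right):
        h_l, c_l, x_l = left
        h_r, c_r, x_r = right
        # Soft head lexicalization: interpolate the children's head vectors.
        z = torch.sigmoid(self.head_gate(torch.cat([x_l, x_r], dim=-1)))
        x = z * x_l + (1.0 - z) * x_r
        # Standard binary tree LSTM update, with one forget gate per child.
        i, f_l, f_r, o, u = self.gates(
            torch.cat([x, h_l, h_r], dim=-1)).chunk(5, dim=-1)
        c = (torch.sigmoid(i) * torch.tanh(u)
             + torch.sigmoid(f_l) * c_l
             + torch.sigmoid(f_r) * c_r)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c, x  # the head vector keeps propagating upward


# Usage: leaves start with their word embedding as the head vector.
dim = 8
cell = HeadLexTreeLSTMCell(dim)
emb = nn.Embedding(100, dim)
zero = torch.zeros(1, dim)


def leaf(word_id):
    e = emb(torch.tensor([word_id]))
    return torch.tanh(e), zero, e


h, c, x = cell(leaf(3), leaf(7))  # combine two leaves into a constituent
```

A top-down pass in the same spirit would run a second set of such cells from the root toward the leaves, which is only possible because head lexicalization gives every internal node a lexical input; concatenating the bottom-up and top-down hidden states then mirrors a bidirectional sequential LSTM.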
