Multi-Timescale Long Short-Term Memory Neural Network for Modelling Sentences and Documents

Neural network based methods have made great progress on a variety of natural language processing tasks. However, modelling long texts, such as sentences and documents, remains a challenging task. In this paper, we propose a multi-timescale long short-term memory (MT-LSTM) neural network to model long texts. MT-LSTM partitions the hidden states of a standard LSTM into several groups, and each group is activated and updated at a different time period. Thus, MT-LSTM can model very long documents as well as short sentences. Experiments on four benchmark datasets show that our model outperforms other neural models on the text classification task.
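
To make the grouping idea concrete, below is a minimal NumPy sketch of an LSTM whose hidden and cell states are partitioned into groups, where each group is only updated at its own time period and otherwise carries its previous state forward. The class name, the contiguous partitioning, and the "update group k every periods[k] steps" schedule are illustrative assumptions based on the abstract, not the paper's exact formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MultiTimescaleLSTM:
    """Sketch of an LSTM with hidden units split into groups that are
    activated at different time periods (assumed schedule: group k
    updates when t % periods[k] == 0; inactive groups copy their state)."""

    def __init__(self, input_size, hidden_size, periods=(1, 2, 4), seed=0):
        rng = np.random.default_rng(seed)
        self.hidden_size = hidden_size
        self.periods = periods
        # Assign each hidden unit to one group via a contiguous split.
        bounds = np.linspace(0, hidden_size, len(periods) + 1).astype(int)
        self.groups = [np.arange(bounds[i], bounds[i + 1])
                       for i in range(len(periods))]
        # Stacked weights for the input, forget, output, and candidate gates.
        self.W = rng.normal(0.0, 0.1, (4 * hidden_size, input_size + hidden_size))
        self.b = np.zeros(4 * hidden_size)

    def step(self, x, h_prev, c_prev, t):
        # Standard LSTM gate computations on the full state.
        z = self.W @ np.concatenate([x, h_prev]) + self.b
        i, f, o, g = np.split(z, 4)
        c_full = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
        h_full = sigmoid(o) * np.tanh(c_full)
        # Only groups whose period divides the current step are updated;
        # the remaining groups keep their previous hidden and cell states.
        h, c = h_prev.copy(), c_prev.copy()
        for period, idx in zip(self.periods, self.groups):
            if t % period == 0:
                h[idx], c[idx] = h_full[idx], c_full[idx]
        return h, c

    def forward(self, xs):
        # xs: sequence of word vectors; returns the final hidden state,
        # usable as a fixed-length representation of the text.
        h = np.zeros(self.hidden_size)
        c = np.zeros(self.hidden_size)
        for t, x in enumerate(xs):
            h, c = self.step(x, h, c, t)
        return h
```

Because the slow groups are updated only every few steps, their states change gradually and can retain information over long spans of a document, while the fast group (period 1) tracks local, word-level context as in a standard LSTM.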
