Learning Hierarchical Structures On-The-Fly with a Recurrent-Recursive Model for Sequences

We propose a hierarchical model for sequential data that learns a tree on-the-fly, i.e., while reading the sequence. In the model, a recurrent network adapts its structure and reuses its recurrent weights in a recursive manner, creating adaptive skip-connections that ease the learning of long-term dependencies. The tree structure can either be inferred without supervision through reinforcement learning or learned in a supervised manner. We provide preliminary experiments on a novel Math Expression Evaluation (MEE) task, which is designed to have an explicit hierarchical tree structure and can therefore be used to study the effectiveness of our model. We additionally evaluate our model on well-known propositional logic and language modelling tasks. Experimental results show the potential of our approach.
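
The abstract does not spell out the architecture's equations, so the following is only a minimal sketch of the mechanism it describes, assuming a shift/merge-style controller: a single GRUCell (a stand-in for whichever recurrent cell the paper actually uses) is reused both for reading tokens sequentially and for recursively composing subtree states, while a small policy head decides on-the-fly when to merge, yielding adaptive skip-connections over the merged spans. The class name RecurrentRecursiveCell, the SHIFT/MERGE actions, and the policy head are hypothetical illustrations, not the authors' code.

```python
# Illustrative sketch only (not the authors' released code): one GRU cell whose
# weights are shared between sequential reading and recursive composition of
# subtree states, so learned merge decisions act as adaptive skip-connections.
# In the paper's setting the merge policy would be trained with REINFORCE
# (unsupervised) or from given tree labels (supervised); here it is untrained
# and actions are simply sampled to show the control flow.
import torch
import torch.nn as nn


class RecurrentRecursiveCell(nn.Module):                # hypothetical name
    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        self.embed = nn.Linear(input_dim, hidden_dim)   # token projection
        self.cell = nn.GRUCell(hidden_dim, hidden_dim)  # shared recurrent/recursive weights
        self.policy = nn.Linear(2 * hidden_dim, 2)      # scores for SHIFT vs. MERGE

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        """tokens: (seq_len, input_dim); returns the root state of the induced tree."""
        stack = []                                      # partial subtree representations
        h = torch.zeros(1, self.cell.hidden_size)
        for x in tokens:
            # SHIFT: read the next token with the recurrent cell.
            h = self.cell(self.embed(x).unsqueeze(0), h)
            stack.append(h)
            # Decide on-the-fly whether to MERGE the two newest subtrees.
            while len(stack) >= 2:
                ctx = torch.cat([stack[-2], stack[-1]], dim=-1)
                probs = torch.softmax(self.policy(ctx), dim=-1)
                if torch.distributions.Categorical(probs).sample().item() == 0:
                    break                               # 0 = keep shifting
                right, left = stack.pop(), stack.pop()
                # MERGE: reuse the same GRU weights recursively to compose the
                # two subtree states; the result skips over the merged span.
                h = self.cell(right, left)
                stack.append(h)
        return stack[-1] if stack else h


# Usage: induce a tree over a random 10-token sequence and return its root state.
model = RecurrentRecursiveCell(input_dim=8, hidden_dim=16)
root = model(torch.randn(10, 8))
print(root.shape)  # torch.Size([1, 16])
```

Sharing one set of recurrent weights for both reading and composition is what the abstract calls reusing recurrent weights in a recursive manner; whether the paper uses a GRU, an LSTM, or another cell is not stated here.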
