Learning Hierarchical Structures On-The-Fly with a Recurrent-Recursive Model for Sequences

We propose a hierarchical model for sequential data that learns a tree on-the-fly, i.e., while reading the sequence. In the model, a recurrent network adapts its structure and reuses its recurrent weights in a recursive manner, creating adaptive skip-connections that ease the learning of long-term dependencies. The tree structure can either be inferred without supervision through reinforcement learning or learned in a supervised manner. We provide preliminary experiments on a novel Math Expression Evaluation (MEE) task, which is designed to have an explicit hierarchical tree structure and can therefore be used to study the effectiveness of our model. We additionally evaluate our model on well-known propositional logic and language modelling tasks. Experimental results show the potential of our approach.
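
The abstract does not spell out the architecture's equations, so the following is only a minimal sketch of the mechanism it describes, assuming a shift/merge-style controller: a single GRUCell (a stand-in for whichever recurrent cell the paper actually uses) is reused both for reading tokens sequentially and for recursively composing subtree states, while a small policy head decides on-the-fly when to merge, yielding adaptive skip-connections over the merged spans. The class name RecurrentRecursiveCell, the SHIFT/MERGE actions, and the policy head are hypothetical illustrations, not the authors' code.

```python
# Illustrative sketch only (not the authors' released code): one GRU cell whose
# weights are shared between sequential reading and recursive composition of
# subtree states, so learned merge decisions act as adaptive skip-connections.
# In the paper's setting the merge policy would be trained with REINFORCE
# (unsupervised) or from given tree labels (supervised); here it is untrained
# and actions are simply sampled to show the control flow.
import torch
import torch.nn as nn


class RecurrentRecursiveCell(nn.Module):                # hypothetical name
    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        self.embed = nn.Linear(input_dim, hidden_dim)   # token projection
        self.cell = nn.GRUCell(hidden_dim, hidden_dim)  # shared recurrent/recursive weights
        self.policy = nn.Linear(2 * hidden_dim, 2)      # scores for SHIFT vs. MERGE

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        """tokens: (seq_len, input_dim); returns the root state of the induced tree."""
        stack = []                                      # partial subtree representations
        h = torch.zeros(1, self.cell.hidden_size)
        for x in tokens:
            # SHIFT: read the next token with the recurrent cell.
            h = self.cell(self.embed(x).unsqueeze(0), h)
            stack.append(h)
            # Decide on-the-fly whether to MERGE the two newest subtrees.
            while len(stack) >= 2:
                ctx = torch.cat([stack[-2], stack[-1]], dim=-1)
                probs = torch.softmax(self.policy(ctx), dim=-1)
                if torch.distributions.Categorical(probs).sample().item() == 0:
                    break                               # 0 = keep shifting
                right, left = stack.pop(), stack.pop()
                # MERGE: reuse the same GRU weights recursively to compose the
                # two subtree states; the result skips over the merged span.
                h = self.cell(right, left)
                stack.append(h)
        return stack[-1] if stack else h


# Usage: induce a tree over a random 10-token sequence and return its root state.
model = RecurrentRecursiveCell(input_dim=8, hidden_dim=16)
root = model(torch.randn(10, 8))
print(root.shape)  # torch.Size([1, 16])
```

Sharing one set of recurrent weights for both reading and composition is what the abstract calls reusing recurrent weights in a recursive manner; whether the paper uses a GRU, an LSTM, or another cell is not stated here.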
