Reservoir Stack Machines

Memory-augmented neural networks equip a recurrent neural network with an explicit memory to support tasks that require storing information without interference over long time spans. A key motivation for such research is to perform classic computation tasks, such as parsing. However, memory-augmented neural networks are notoriously hard to train, requiring many backpropagation epochs and large amounts of data. In this paper, we introduce the reservoir stack machine, a model which can provably recognize all deterministic context-free languages and circumvents the training problem by training only the output layer of a recurrent net and by employing auxiliary information during training about the desired interaction with a stack. In our experiments, we validate the reservoir stack machine against deep and shallow networks from the literature on three benchmark tasks for Neural Turing machines and six deterministic context-free languages. Our results show that the reservoir stack machine achieves zero error, even on test sequences longer than the training data, requiring only a few seconds of training time and 100 training sequences.
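To illustrate the reservoir-computing principle the abstract refers to (only the output layer of a recurrent net is trained), the following minimal sketch fits a linear readout on the states of a fixed random recurrent network via ridge regression. The stack module and the auxiliary supervision of stack actions are deliberately omitted, and all class and parameter names here are hypothetical, not the authors' implementation.

```python
# Minimal reservoir-computing sketch (assumption: illustrative only).
# A fixed random recurrent "reservoir" encodes the input sequence; only a
# linear readout is trained, here by ridge regression in closed form.
import numpy as np

class EchoStateReadout:
    def __init__(self, n_in, n_res=200, spectral_radius=0.9, ridge=1e-4, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.uniform(-1.0, 1.0, size=(n_res, n_in))
        W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
        # Rescale recurrent weights to the desired spectral radius so the
        # reservoir (approximately) satisfies the echo state property.
        W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
        self.W = W
        self.ridge = ridge
        self.W_out = None

    def _states(self, X):
        # X: array of shape (T, n_in), e.g. one-hot encoded input symbols.
        h = np.zeros(self.W.shape[0])
        states = []
        for x in X:
            h = np.tanh(self.W_in @ x + self.W @ h)
            states.append(h)
        return np.stack(states)  # shape (T, n_res)

    def fit(self, sequences, targets):
        # Collect reservoir states over all training sequences and solve a
        # single regularized least-squares problem for the readout weights.
        H = np.vstack([self._states(X) for X in sequences])
        Y = np.vstack(targets)
        A = H.T @ H + self.ridge * np.eye(H.shape[1])
        self.W_out = np.linalg.solve(A, H.T @ Y)
        return self

    def predict(self, X):
        return self._states(X) @ self.W_out
```

In a reservoir stack machine, analogous linear readouts would additionally predict the desired stack interaction (e.g. push and pop decisions), fitted against the auxiliary supervision signal mentioned in the abstract rather than learned end-to-end by backpropagation.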
