Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery

Reinforcement learning (RL) agents performing complex tasks must be able to remember observations and actions across sizable time intervals. This is especially true during the initial learning stages, when exploratory behaviour can increase the delay between specific actions and their effects. Most prevalent approaches to learning these distant correlations rely on backpropagation through time (BPTT), but this technique requires storing observation traces long enough to span the interval between cause and effect. Beyond the memory cost, BPTT's practicality is limited by learning dynamics such as vanishing gradients and the slow convergence that results from infrequent weight updates; and while online learning for recurrent networks is an active research area, most existing methods are not yet efficient enough to serve as practical replacements. We propose a simple, effective memory strategy that extends the temporal window over which BPTT can learn without requiring longer traces. We explore this approach empirically on a few tasks and discuss its implications.
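
Since the abstract does not spell out the update rule, the sketch below assumes one plausible reading of a "low-pass" recurrent state: the cell's candidate activation is mixed into a slowly decaying running average (a leaky integrator), so information persists for roughly 1/alpha time steps even when BPTT is truncated to a much shorter window. The function name, parameter layout, and the coefficient `alpha` are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def lowpass_rnn_step(x_t, h_prev, params, alpha=0.05):
    """One step of a hypothetical leaky-integrator ("low-pass") recurrent cell.

    A standard tanh candidate activation is blended into the previous hidden
    state with a small coefficient alpha, so the state behaves like an
    exponential moving average of past activations and retains information
    for roughly 1/alpha steps, beyond the BPTT truncation length.
    """
    W_x, W_h, b = params
    candidate = np.tanh(W_x @ x_t + W_h @ h_prev + b)
    return (1.0 - alpha) * h_prev + alpha * candidate

# Illustrative use: unroll the cell over a short observation sequence.
rng = np.random.default_rng(0)
obs_dim, hidden_dim, T = 8, 16, 20
params = (rng.normal(scale=0.1, size=(hidden_dim, obs_dim)),
          rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)),
          np.zeros(hidden_dim))
h = np.zeros(hidden_dim)
for t in range(T):
    h = lowpass_rnn_step(rng.normal(size=obs_dim), h, params)
```

With a small `alpha`, gradients flowing through the slowly changing state decay gently rather than vanishing within a few steps, which is one way such a filtered state could let truncated BPTT pick up correlations longer than the stored trace.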
