Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery

Reinforcement learning (RL) agents performing complex tasks must be able to remember observations and actions across sizable time intervals. This is especially true during the initial learning stages, when exploratory behaviour can increase the delay between specific actions and their effects. Most prevalent approaches to learning these distant correlations rely on backpropagation through time (BPTT), but this technique requires storing observation traces long enough to span the interval between cause and effect. Beyond the memory cost, BPTT's practicality is limited by learning dynamics such as vanishing gradients and the slow convergence that results from infrequent weight updates; and while online learning for recurrent networks is an active research area, most existing methods are not yet efficient enough to serve as practical replacements. We propose a simple, effective memory strategy that extends the temporal window over which BPTT can learn without requiring longer traces. We explore this approach empirically on a few tasks and discuss its implications.
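
Since the abstract does not spell out the update rule, the sketch below assumes one plausible reading of a "low-pass" recurrent state: the cell's candidate activation is mixed into a slowly decaying running average (a leaky integrator), so information persists for roughly 1/alpha time steps even when BPTT is truncated to a much shorter window. The function name, parameter layout, and the coefficient `alpha` are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def lowpass_rnn_step(x_t, h_prev, params, alpha=0.05):
    """One step of a hypothetical leaky-integrator ("low-pass") recurrent cell.

    A standard tanh candidate activation is blended into the previous hidden
    state with a small coefficient alpha, so the state behaves like an
    exponential moving average of past activations and retains information
    for roughly 1/alpha steps, beyond the BPTT truncation length.
    """
    W_x, W_h, b = params
    candidate = np.tanh(W_x @ x_t + W_h @ h_prev + b)
    return (1.0 - alpha) * h_prev + alpha * candidate

# Illustrative use: unroll the cell over a short observation sequence.
rng = np.random.default_rng(0)
obs_dim, hidden_dim, T = 8, 16, 20
params = (rng.normal(scale=0.1, size=(hidden_dim, obs_dim)),
          rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)),
          np.zeros(hidden_dim))
h = np.zeros(hidden_dim)
for t in range(T):
    h = lowpass_rnn_step(rng.normal(size=obs_dim), h, params)
```

With a small `alpha`, gradients flowing through the slowly changing state decay gently rather than vanishing within a few steps, which is one way such a filtered state could let truncated BPTT pick up correlations longer than the stored trace.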
