[1] Elliot A. Ludvig, et al. From eye-blinks to state construction: Diagnostic benchmarks for online representation learning, 2020.
[2] Richard S. Sutton, et al. Temporal Abstraction in Temporal-difference Networks, 2005, NIPS.
[3] Yann Ollivier, et al. Unbiased Online Recurrent Optimization, 2017, ICLR.
[4] Yoshua Bengio, et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies, 2001.
[5] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[6] Yoshua Bengio, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, ArXiv.
[7] Richard S. Sutton, et al. Temporal-Difference Networks, 2004, NIPS.
[8] Richard S. Sutton, et al. Continual Backprop: Stochastic Gradient Descent with Persistent Randomness, 2021, ArXiv.
[9] Richard S. Sutton, et al. Online Learning with Random Representations, 1993, ICML.
[10] I. Pavlov, et al. Conditioned reflexes: An investigation of the physiological activity of the cerebral cortex, 2010, Annals of Neurosciences.
[11] Ronald J. Williams, et al. A Learning Algorithm for Continually Running Fully Recurrent Neural Networks, 1989, Neural Computation.
[12] Jing Peng, et al. An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories, 1990, Neural Computation.
[13] Jeffrey L. Elman, et al. Finding Structure in Time, 1990, Cogn. Sci.
[14] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[15] Martha White, et al. General Value Function Networks, 2018, J. Artif. Intell. Res.
[16] M. Gabriel, et al. Learning and Computational Neuroscience: Foundations of Adaptive Networks, 1990.
[17] Richard S. Sutton, et al. Multi-timescale nexting in a reinforcement learning robot, 2011, Adapt. Behav.
[18] Erich Elsen, et al. A Practical Sparse Approximation for Real Time Recurrent Learning, 2020, ArXiv.
[19] Richard S. Sutton, et al. Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta, 1992, AAAI.
[20] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[21] Elliot A. Ludvig, et al. Evaluating the TD model of classical conditioning, 2012, Learning & behavior.
[22] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[23] Christian Lebiere, et al. The Cascade-Correlation Learning Architecture, 1989, NIPS.
[24] Richard S. Sutton, et al. Representation Search through Generate and Test, 2013, AAAI Workshop: Learning Rich Representations from Low-Level Sensors.
[25] Richard S. Sutton, et al. Time-Derivative Models of Pavlovian Reinforcement, 1990.