Prediction problems inspired by animal learning

[1]  C. L. Hull.  The problem of stimulus equivalence in behavior theory , 1939 .

[2]  N. Schneiderman.  Interstimulus interval function of the nictitating membrane response of the rabbit under delay versus trace conditioning , 1966 .

[3]  N. Mackintosh.  The psychology of animal learning , 1974 .

[4]  Paul J. Werbos,et al.  Generalization of backpropagation with application to a recurrent gas market model , 1988, Neural Networks.

[5]  Michael C. Mozer,et al.  A Focused Backpropagation Algorithm for Temporal Pattern Recognition , 1989, Complex Syst..

[6]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[7]  Richard S. Sutton,et al.  Time-Derivative Models of Pavlovian Reinforcement , 1990 .

[8]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[9]  Richard S. Sutton,et al.  Stimulus Representation and the Timing of Reward-Prediction Errors in Models of the Dopamine System , 2008, Neural Computation.

[10]  Richard S. Sutton,et al.  A computational model of hippocampal function in trace conditioning , 2008, NIPS.

[11]  Justin A. Harris,et al.  Negative patterning is easier than a biconditional discrimination , 2008, Journal of experimental psychology. Animal behavior processes.

[12]  Charles R. Gallistel,et al.  Memory and the Computational Brain: Why Cognitive Science will Transform Neuroscience , 2009 .

[13]  Elliot A. Ludvig,et al.  Evaluating the TD model of classical conditioning , 2012, Learning & behavior.

[14]  Yoshua Bengio,et al.  Conditioning and time representation in long short-term memory networks , 2013, Biological Cybernetics.

[15]  Yoshua Bengio,et al.  Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.

[16]  Richard S. Sutton,et al.  Multi-timescale nexting in a reinforcement learning robot , 2011, Adapt. Behav..

[17]  Richard S. Sutton,et al.  Learning to Predict Independent of Span , 2015, ArXiv.

[18]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[19]  Marc G. Bellemare,et al.  The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract) , 2012, IJCAI.

[20]  Chrissy M Chubala,et al.  Intertrial unconditioned stimuli differentially impact trace conditioning , 2017, Learning & behavior.

[21]  Allan R. Wagner,et al.  Expectancies and the Priming of STM , 2018 .

[22]  Marlos C. Machado,et al.  Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents , 2017, J. Artif. Intell. Res..

[23]  Joel Z. Leibo,et al.  Unsupervised Predictive Memory in a Goal-Directed Agent , 2018, ArXiv.

[24]  André Luzardo,et al.  The Rescorla-Wagner Drift-Diffusion model , 2018 .

[25]  Joel Z. Leibo,et al.  Generalization of Reinforcement Learners with Working and Episodic Memory , 2019, NeurIPS.