Allen Newell, et al. The chess machine: an example of dealing with a complex task by adaptation, 1955, AFIPS '55 (Western).
Arthur L. Samuel, et al. Some Studies in Machine Learning Using the Game of Checkers, 1959, IBM J. Res. Dev.
Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
Arthur L. Samuel, et al. Some studies in machine learning using the game of checkers, 2000, IBM J. Res. Dev.
G. Loewenstein, et al. Time Discounting and Time Preference: A Critical Review, 2002.
W. K. Cullen, et al. Dopamine-dependent facilitation of LTP induction in hippocampal CA1 by exposure to spatial novelty, 2003, Nature Neuroscience.
M. McDaniel, et al. Delaying execution of intentions: overcoming the costs of interruptions, 2004.
Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning, 2004, Machine Learning.
N. Lemon, et al. Dopamine D1/D5 Receptors Gate the Acquisition of Novel Information through Hippocampal Long-Term Potentiation and Long-Term Depression, 2006, The Journal of Neuroscience.
D. Hassabis, et al. Using Imagination to Understand the Neural Basis of Episodic Memory, 2007, The Journal of Neuroscience.
D. Schacter, et al. Remembering the past to imagine the future: the prospective brain, 2007, Nature Reviews Neuroscience.
Peter Dayan, et al. Hippocampal Contributions to Control: The Third Way, 2007, NIPS.
Russ Tedrake, et al. Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms, 2008, NIPS.
Richard S. Sutton, et al. Sample-based learning and search with permanent and transient memories, 2008, ICML '08.
Aude Oliva, et al. Visual long-term memory has a massive storage capacity for object details, 2008, Proceedings of the National Academy of Sciences.
John R. Anderson, et al. Solving the credit assignment problem: explicit and implicit learning of action sequences with probabilistic outcomes, 2008, Psychological Research.
Geoffrey E. Hinton, et al. Speech recognition with deep recurrent neural networks, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
Sergio Gomez Colmenarejo, et al. Hybrid computing using a neural network with dynamic external memory, 2016, Nature.
Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
Francesco Visin, et al. A guide to convolution arithmetic for deep learning, 2016, ArXiv.
N. Daw, et al. Reinforcement Learning and Episodic Memory in Humans and Animals: An Integrative Framework, 2017, Annual Review of Psychology.
Shane Legg, et al. Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents, 2018, ArXiv.
J. Pearl, et al. The Book of Why: The New Science of Cause and Effect, 2018.
Joel Z. Leibo, et al. Unsupervised Predictive Memory in a Goal-Directed Agent, 2018, ArXiv.
Jürgen Schmidhuber, et al. Recurrent World Models Facilitate Policy Evolution, 2018, NeurIPS.
Zeb Kurth-Nelson, et al. Been There, Done That: Meta-Learning with Episodic Recall, 2018, ICML.
Mélanie Frappier. The Book of Why: The New Science of Cause and Effect, 2018, Science.
Christopher Joseph Pal, et al. Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding, 2018, NeurIPS.
Rémi Munos, et al. Recurrent Experience Replay in Distributed Reinforcement Learning, 2019, ICLR.