Tom Schaul | Joel Z. Leibo | Max Jaderberg | Wojciech M. Czarnecki | Koray Kavukcuoglu | David Silver | Volodymyr Mnih