Imagination-Augmented Agents for Deep Reinforcement Learning
Razvan Pascanu | Demis Hassabis | David Silver | Daan Wierstra | Nicolas Heess | Lars Buesing | Arthur Guez | Sébastien Racanière | Theophane Weber | Danilo Jimenez Rezende | Oriol Vinyals | David P. Reichert | Peter W. Battaglia | Adrià Puigdomènech Badia | Yujia Li
[1] E. Tolman. Cognitive maps in rats and men, 1948, Psychological Review.
[2] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[3] Jürgen Schmidhuber, et al. An on-line algorithm for dynamic reinforcement learning and planning in reactive environments, 1990, IJCNN International Joint Conference on Neural Networks.
[4] Jing Peng, et al. Efficient Learning and Planning Within the Dyna Framework, 1993, Adapt. Behav.
[5] Gerald Tesauro, et al. On-line Policy Improvement using Monte-Carlo Search, 1996, NIPS.
[6] Hitoshi Matsubara, et al. Automatic Making of Sokoban Problems, 1996, PRICAI.
[7] B. Balleine, et al. The Role of Learning in the Operation of Motivational Systems, 2002.
[8] Pieter Abbeel, et al. Exploration and apprenticeship learning in reinforcement learning, 2005, ICML.
[9] Rémi Coulom, et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, 2006, Computers and Games.
[10] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[11] David Silver, et al. Combining online and offline knowledge in UCT, 2007, ICML.
[12] D. Hassabis, et al. Patients with hippocampal amnesia cannot imagine new experiences, 2007, Proceedings of the National Academy of Sciences.
[13] Shane Legg, et al. Universal Intelligence: A Definition of Machine Intelligence, 2007, Minds and Machines.
[14] D. Hassabis, et al. Using Imagination to Understand the Neural Basis of Episodic Memory, 2007, The Journal of Neuroscience.
[15] Levente Kocsis, et al. Transpositions and move groups in Monte Carlo tree search, 2008, IEEE Symposium on Computational Intelligence and Games.
[16] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.
[17] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[18] Christopher D. Rosin, et al. Nested Rollout Policy Adaptation for Monte Carlo Tree Search, 2011, IJCAI.
[19] Joshua Taylor, et al. Procedural Generation of Sokoban Levels, 2011.
[20] R. N. Spreng, et al. The Future of Memory: Remembering, Imagining, and the Brain, 2012, Neuron.
[21] Brad E. Pfeiffer, et al. Hippocampal place cell sequences depict future paths to remembered goals, 2013, Nature.
[22] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[23] Erik Talvitie, et al. Model Regularization for Stable Sample Rollouts, 2014, UAI.
[24] Sergey Levine, et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics, 2014, NIPS.
[25] Jonathan P. How, et al. Real-World Reinforcement Learning via Multifidelity Simulators, 2015, IEEE Transactions on Robotics.
[26] Pieter Abbeel, et al. Gradient Estimation Using Stochastic Computation Graphs, 2015, NIPS.
[27] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[28] Erik Talvitie, et al. Agnostic System Identification for Monte Carlo Planning, 2015, AAAI.
[29] Sergey Levine, et al. Towards Adapting Deep Visuomotor Representations from Simulated to Real Environments, 2015, ArXiv.
[30] Martin A. Riedmiller, et al. Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images, 2015, NIPS.
[31] Samy Bengio, et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks, 2015, NIPS.
[32] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[33] Ross A. Knepper, et al. DeepMPC: Learning Deep Latent Features for Model Predictive Control, 2015, Robotics: Science and Systems.
[34] Jürgen Schmidhuber, et al. On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models, 2015, ArXiv.
[35] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[36] Alex Graves, et al. Adaptive Computation Time for Recurrent Neural Networks, 2016, ArXiv.
[37] Wojciech Zaremba, et al. Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model, 2016, ArXiv.
[38] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[39] Martial Hebert, et al. Improved Learning of Dynamics Models for Control, 2016, ISER.
[40] Katja Hofmann, et al. A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games, 2016, ICLR.
[41] Joshua B. Tenenbaum, et al. Building machines that learn and think like people, 2016, Behavioral and Brain Sciences.
[42] Pieter Abbeel, et al. Value Iteration Networks, 2016, NIPS.
[43] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[44] Sergey Levine, et al. Continuous Deep Q-Learning with Model-based Acceleration, 2016, ICML.
[45] Razvan Pascanu, et al. Metacontrol for Adaptive Imagination-Based Optimization, 2017, ICLR.
[46] Andreas Krause, et al. Virtual vs. real: Trading off simulations and physical experiments in reinforcement learning with Bayesian optimization, 2017, IEEE International Conference on Robotics and Automation (ICRA).
[47] Sergey Levine, et al. Goal-driven dynamics learning via Bayesian optimization, 2017, IEEE 56th Annual Conference on Decision and Control (CDC).
[48] Tom Schaul, et al. The Predictron: End-To-End Learning and Planning, 2016, ICML.
[49] Sergey Levine, et al. Deep visual foresight for planning robot motion, 2016, IEEE International Conference on Robotics and Automation (ICRA).
[50] Dileep George, et al. Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics, 2017, ICML.
[51] Daan Wierstra, et al. Recurrent Environment Simulators, 2017, ICLR.
[52] Razvan Pascanu, et al. Learning model-based planning from scratch, 2017, ArXiv.
[53] Yann LeCun, et al. Model-Based Planning with Discrete and Continuous Actions, 2017.
[54] Satinder Singh, et al. Value Prediction Network, 2017, NIPS.
[55] Razvan Pascanu, et al. Learning to Navigate in Complex Environments, 2016, ICLR.
[56] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[57] Yann LeCun, et al. Model-Based Planning in Discrete Action Spaces, 2017, ArXiv.
[58] Sergey Levine, et al. Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation, 2017, IEEE International Conference on Robotics and Automation (ICRA).