FeUdal Networks for Hierarchical Reinforcement Learning
Alexander Sasha Vezhnevets | Simon Osindero | Tom Schaul | Nicolas Heess | Max Jaderberg | David Silver | Koray Kavukcuoglu