Strategic Attentive Writer for Learning Macro-Actions
Alexander Vezhnevets | Volodymyr Mnih | John Agapiou | Simon Osindero | Alex Graves | Oriol Vinyals | Koray Kavukcuoglu
[1] Doina Precup, et al. Planning with Closed Loop Macro Actions, 2008.
[2] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[3] Shie Mannor, et al. A Deep Hierarchical Approach to Lifelong Learning in Minecraft, 2016, AAAI.
[4] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.
[5] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[6] Ronen I. Brafman, et al. Prioritized Goal Decomposition of Markov Decision Processes: Toward a Synthesis of Classical and Decision Theoretic Planning, 1997, IJCAI.
[7] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[8] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[9] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[10] Doina Precup, et al. Temporal Abstraction in Reinforcement Learning, 2000, ICML.
[11] Shane Legg, et al. Human-Level Control through Deep Reinforcement Learning, 2015, Nature.
[12] Richard S. Sutton, et al. TD Models: Modeling the World at a Mixture of Time Scales, 1995, ICML.
[13] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[14] Jürgen Schmidhuber, et al. Learning to Forget: Continual Prediction with LSTM, 2000, Neural Computation.
[15] Alex Graves, et al. DRAW: A Recurrent Neural Network for Image Generation, 2015, ICML.
[16] Jürgen Schmidhuber. Neural Sequence Chunkers, 1991.
[17] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[18] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[19] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[20] G. Miller. Learning to Forget, 2004, Science.
[21] Leslie Pack Kaelbling, et al. Hierarchical Learning in Stochastic Domains: Preliminary Results, 1993, ICML.
[22] Peter Dayan, et al. Improving Generalization for Temporal Difference Learning: The Successor Representation, 1993, Neural Computation.
[23] M. Botvinick, et al. Hierarchically Organized Behavior and Its Neural Foundations: A Reinforcement Learning Perspective, 2009, Cognition.
[24] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[25] Jürgen Schmidhuber, et al. HQ-Learning, 1997, Adapt. Behav.
[26] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[27] Michael C. Mozer. A Focused Backpropagation Algorithm for Temporal Pattern Recognition, 1989, Complex Syst.
[28] Alex Graves, et al. Generating Sequences with Recurrent Neural Networks, 2013, ArXiv.
[29] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[30] Yoshua Bengio, et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation, 2013, ArXiv.
[31] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2012, IJCAI.
[32] Doina Precup, et al. Theoretical Results on Reinforcement Learning with Temporally Abstract Options, 1998, ECML.