Learning Functionally Decomposed Hierarchies for Continuous Control Tasks
[1] Sergey Levine, et al. Planning with Goal-Conditioned Policies, 2019, NeurIPS.
[2] Pieter Abbeel, et al. Benchmarking Model-Based Reinforcement Learning, 2019, ArXiv.
[3] Sergey Levine, et al. Search on the Replay Buffer: Bridging Planning and Reinforcement Learning, 2019, NeurIPS.
[4] Yee Whye Teh, et al. Exploiting Hierarchy for Learning and Transfer in KL-regularized RL, 2019, ArXiv.
[5] Sergey Levine, et al. Near-Optimal Representation Learning for Hierarchical Reinforcement Learning, 2018, ICLR.
[6] Sergey Levine, et al. Data-Efficient Hierarchical Reinforcement Learning, 2018, NeurIPS.
[7] Allan Jabri, et al. Universal Planning Networks, 2018, ICML.
[8] Pushmeet Kohli, et al. Value Propagation Networks, 2018, ICLR.
[9] Yuval Tassa, et al. DeepMind Control Suite, 2018, ArXiv.
[10] Kate Saenko, et al. Learning Multi-Level Hierarchies with Hindsight, 2017, ICLR.
[11] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[12] Pieter Abbeel, et al. Meta Learning Shared Hierarchies, 2017, ICLR.
[13] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[14] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[15] Satinder Singh, et al. Value Prediction Network, 2017, NIPS.
[16] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[17] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[18] Rahul Sukthankar, et al. Cognitive Mapping and Planning for Visual Navigation, 2017, International Journal of Computer Vision.
[19] Pieter Abbeel, et al. Stochastic Neural Networks for Hierarchical Reinforcement Learning, 2016, ICLR.
[20] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[21] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[22] Pieter Abbeel, et al. Value Iteration Networks, 2016, NIPS.
[23] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[24] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[25] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[26] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[27] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[28] Mingrui Wu, et al. Gradient descent optimization of smoothed information retrieval metrics, 2010, Information Retrieval.
[29] Alborz Geramifard, et al. Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping, 2008, UAI.
[30] David Andre, et al. State abstraction for programmable reinforcement learning agents, 2002, AAAI/IAAI.
[31] Andrew G. Barto, et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density, 2001, ICML.
[32] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[33] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[34] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[35] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[36] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[37] Jürgen Schmidhuber, et al. Learning to generate subgoals for action sequences, 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[38] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.