Expanding Motor Skills using Relay Networks
[1] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[2] Philippe Beaudoin,et al. Robust task-based control policies for physics-based characters , 2009, SIGGRAPH 2009.
[3] Joshua B. Tenenbaum,et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation , 2016, NIPS.
[4] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[5] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[6] Jan Peters,et al. Hierarchical Relative Entropy Policy Search , 2014, AISTATS.
[7] Sven Behnke,et al. Bayesian exploration and interactive demonstration in continuous state MAXQ-learning , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[8] Yuval Tassa,et al. Learning and Transfer of Modulated Locomotor Controllers , 2016, ArXiv.
[9] Yuval Tassa,et al. Data-efficient Deep Reinforcement Learning for Dexterous Manipulation , 2017, ArXiv.
[10] Daniel E. Koditschek,et al. Sequential Composition of Dynamically Dexterous Robot Behaviors , 1999, Int. J. Robotics Res..
[11] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[12] Feng Wu,et al. Online planning for large MDPs with MAXQ decomposition , 2012, AAMAS.
[13] Pieter Abbeel,et al. Reverse Curriculum Generation for Reinforcement Learning , 2017, CoRL.
[14] Glen Berseth,et al. Terrain-adaptive locomotion skills using deep reinforcement learning , 2016, ACM Trans. Graph..
[15] Eugene Fiume,et al. Domain of attraction expansion for physics-based character control , 2017, ACM Trans. Graph..
[16] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[17] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[18] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[19] Andrew G. Barto,et al. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining , 2009, NIPS.
[20] Glen Berseth,et al. DeepLoco , 2017, ACM Trans. Graph..
[21] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[22] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[23] Russ Tedrake,et al. LQR-trees: Feedback motion planning on sparse randomized trees , 2009, Robotics: Science and Systems.
[24] Jörg Stückler,et al. Getting Back on Two Feet: Reliable Standing-up Routines for a Humanoid Robot , 2006, IAS.
[25] R. J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[26] M. Vukobratović,et al. On the stability of biped locomotion , 1970, IEEE Transactions on Bio-Medical Engineering.
[27] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[28] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[29] Sehoon Ha,et al. Learning a unified control policy for safe falling , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).