Sergey Levine | Sehoon Ha | Jie Tan | George Tucker | Aurick Zhou | Tuomas Haarnoja
[1] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[2] Roland Siegwart,et al. Practice Makes Perfect: An Optimization-Based Approach to Controlling Agile Motions for a Quadruped Robot , 2016, IEEE Robotics & Automation Magazine.
[3] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[4] Sergey Levine,et al. Composable Deep Reinforcement Learning for Robotic Manipulation , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[5] Stefan Schaal,et al. Policy Gradient Methods for Robotics , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[6] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[7] Sergey Levine,et al. Guided Policy Search , 2013, ICML.
[8] Yuval Tassa,et al. Emergence of Locomotion Behaviours in Rich Environments , 2017, ArXiv.
[9] Christopher G. Atkeson,et al. Bayesian Optimization Using Domain Knowledge on the ATRIAS Biped , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[10] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[11] Nikolaus Hansen,et al. The CMA Evolution Strategy: A Comparing Review , 2006, Towards a New Evolutionary Computation.
[12] Joonho Lee,et al. Learning agile and dynamic motor skills for legged robots , 2019, Science Robotics.
[13] Sehoon Ha,et al. Automated Deep Reinforcement Learning Environment for Hardware of a Modular Legged Robot , 2018, 2018 15th International Conference on Ubiquitous Robots (UR).
[14] Marc Toussaint,et al. On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference , 2012, Robotics: Science and Systems.
[15] Dale Schuurmans,et al. Bridging the Gap Between Value and Policy Based Reinforcement Learning , 2017, NIPS.
[16] Ferdinando Cannella,et al. Design of HyQ – a hydraulically and electrically actuated quadruped robot , 2011 .
[17] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[18] Shie Mannor,et al. A Deep Hierarchical Approach to Lifelong Learning in Minecraft , 2016, AAAI.
[19] P. Cochat,et al. Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.
[20] Hannes Sommer,et al. Quadrupedal locomotion using hierarchical operational space control , 2014, Int. J. Robotics Res..
[21] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[22] James Bergstra,et al. Benchmarking Reinforcement Learning Algorithms on Real-World Robots , 2018, CoRL.
[23] Emanuel Todorov,et al. General duality between optimal control and estimation , 2008, 2008 47th IEEE Conference on Decision and Control.
[24] Marc Toussaint,et al. Robot trajectory optimization using approximate inference , 2009, ICML '09.
[25] Sangbae Kim,et al. MIT Cheetah 3: Design and Control of a Robust, Dynamic Quadruped Robot , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[26] Roland Siegwart,et al. Towards automatic discovery of agile gaits for quadrupedal robots , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[27] Glen Berseth,et al. Feedback Control For Cassie With Deep Reinforcement Learning , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[28] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[29] Atil Iscen,et al. Policies Modulating Trajectory Generators , 2018, CoRL.
[30] Pieter Abbeel,et al. Constrained Policy Optimization , 2017, ICML.
[31] Yuval Tassa,et al. Maximum a Posteriori Policy Optimisation , 2018, ICLR.
[32] Pieter Abbeel,et al. Equivalence Between Policy Gradients and Soft Q-Learning , 2017, ArXiv.
[33] Sergey Levine,et al. Latent Space Policies for Hierarchical Reinforcement Learning , 2018, ICML.
[34] Sergey Levine,et al. Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning , 2017, ICLR.
[35] Glen Berseth,et al. Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control , 2018, ICLR.
[36] H. Sebastian Seung,et al. Learning to Walk in 20 Minutes , 2005 .
[37] Glen Berseth,et al. Terrain-adaptive locomotion skills using deep reinforcement learning , 2016, ACM Trans. Graph..
[38] Jan Peters,et al. Bayesian optimization for learning gaits under uncertainty , 2015, Annals of Mathematics and Artificial Intelligence.
[39] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[40] Sangbae Kim,et al. Dynamic Locomotion in the MIT Cheetah 3 Through Convex Model-Predictive Control , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[41] Koray Kavukcuoglu,et al. PGQ: Combining policy gradient and Q-learning , 2016, ArXiv.
[42] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[43] Peter Stone,et al. Policy gradient reinforcement learning for fast quadrupedal locomotion , 2004, IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004.
[44] Henry Zhu,et al. Soft Actor-Critic Algorithms and Applications , 2018, ArXiv.
[45] Philip Bachman,et al. Deep Reinforcement Learning that Matters , 2017, AAAI.
[46] Sergey Levine,et al. Reinforcement Learning with Deep Energy-Based Policies , 2017, ICML.
[47] Shie Mannor,et al. Reward Constrained Policy Optimization , 2018, ICLR.
[48] Dale Schuurmans,et al. Trust-PCL: An Off-Policy Trust Region Method for Continuous Control , 2017, ICLR.
[49] Atil Iscen,et al. Sim-to-Real: Learning Agile Locomotion For Quadruped Robots , 2018, Robotics: Science and Systems.
[50] Dale Schuurmans,et al. Smoothed Action Value Functions for Learning Gaussian Policies , 2018, ICML.
[51] Peter Fankhauser,et al. ANYmal - a highly mobile and dynamic quadrupedal robot , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[52] Taylor Apgar,et al. Fast Online Trajectory Optimization for the Bipedal Robot Cassie , 2018, Robotics: Science and Systems.
[53] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[54] Raia Hadsell,et al. Value constrained model-free continuous control , 2019, ArXiv.
[55] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[56] Daniel E. Koditschek,et al. Design Principles for a Family of Direct-Drive Legged Robots , 2016, IEEE Robotics and Automation Letters.
[57] Hado van Hasselt,et al. Double Q-learning , 2010, NIPS.
[58] Marc H. Raibert,et al. Legged Robots That Balance , 1986, IEEE Expert.