Paper information - Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots

Developing robust walking controllers for bipedal robots is a challenging endeavor. Traditional model-based locomotion controllers require simplifying assumptions and careful modelling; any small errors can result in unstable control. To address these challenges for bipedal locomotion, we present a model-free reinforcement learning framework for training robust locomotion policies in simulation, which can then be transferred to a real bipedal Cassie robot. To facilitate sim-to-real transfer, domain randomization is used to encourage the policies to learn behaviors that are robust across variations in system dynamics. The learned policies enable Cassie to perform a set of diverse and dynamic behaviors, while also being more robust than traditional controllers and prior learning-based methods that use residual control. We demonstrate this on versatile walking behaviors such as tracking a target walking velocity, walking height, and turning yaw. (Video)
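
The abstract names domain randomization but does not say which dynamics parameters are varied or over what ranges. As a rough illustration of the general idea only, the following minimal sketch resamples a few simulator parameters at the start of every training episode. The parameter names (`ground_friction`, `link_mass_scale`, `joint_damping_scale`), their ranges, and the helper functions are all hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class DynamicsParams:
    """Hypothetical simulator parameters; the names are illustrative only."""
    ground_friction: float
    link_mass_scale: float
    joint_damping_scale: float


# Hypothetical (low, high) ranges, NOT taken from the paper.
RANDOMIZATION_RANGES = {
    "ground_friction": (0.5, 1.5),
    "link_mass_scale": (0.8, 1.2),
    "joint_damping_scale": (0.7, 1.3),
}


def sample_dynamics(rng: random.Random) -> DynamicsParams:
    """Draw one set of dynamics parameters uniformly from the ranges above."""
    return DynamicsParams(**{
        name: rng.uniform(low, high)
        for name, (low, high) in RANDOMIZATION_RANGES.items()
    })


def randomized_episodes(num_episodes: int, seed: int = 0):
    """Yield a freshly sampled DynamicsParams for each training episode."""
    rng = random.Random(seed)
    for _ in range(num_episodes):
        yield sample_dynamics(rng)


if __name__ == "__main__":
    for episode, params in enumerate(randomized_episodes(3)):
        # A real training loop would reset the simulator with `params`
        # and collect a rollout with the current policy at this point.
        print(episode, params)
```

Because the policy never sees the same dynamics twice in a row, it cannot overfit to one simulator configuration, which is the intuition behind using randomization to aid sim-to-real transfer.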
