Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots

Developing robust walking controllers for bipedal robots is challenging. Traditional model-based locomotion controllers require simplifying assumptions and careful modelling, and even small modelling errors can result in unstable control. To address these challenges, we present a model-free reinforcement learning framework for training robust locomotion policies in simulation, which can then be transferred to a real bipedal Cassie robot. To facilitate sim-to-real transfer, domain randomization is used to encourage the policies to learn behaviors that are robust across variations in system dynamics. The learned policies enable Cassie to perform a set of diverse and dynamic behaviors, while also being more robust than traditional controllers and prior learning-based methods that use residual control. We demonstrate this on versatile walking behaviors such as tracking a target walking velocity, walking height, and turning yaw. (Video1)
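As a rough illustration of the domain randomization idea mentioned above, the sketch below draws a fresh set of dynamics parameters for each training episode. The parameter names, the ranges, and the helper functions are hypothetical and chosen only for illustration; they are not taken from the paper, and no simulator is involved.

```python
import random

# Hypothetical randomization ranges (assumptions for illustration only,
# not the values used by the authors).
RANDOMIZATION_RANGES = {
    "ground_friction": (0.5, 1.5),
    "link_mass_scale": (0.8, 1.2),
    "joint_damping_scale": (0.7, 1.3),
}


def sample_dynamics(rng, ranges=RANDOMIZATION_RANGES):
    """Draw one set of dynamics parameters, uniformly within each range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}


def episode_parameters(num_episodes, seed=0):
    """Return one independently sampled parameter set per episode."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [sample_dynamics(rng) for _ in range(num_episodes)]


if __name__ == "__main__":
    # In a real training loop each parameter set would be applied to the
    # simulator before the episode starts; here the values are only printed.
    for params in episode_parameters(3):
        print(params)
```

Because the policy never sees the same dynamics twice, it is pushed toward behaviors that work across the whole sampled range rather than ones tuned to a single simulated model.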
