Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion

We present a hierarchical framework that combines model-based control and reinforcement learning (RL) to synthesize robust controllers for a quadruped (the Unitree Laikago). The system consists of a high-level controller that learns to choose from a set of primitives in response to changes in the environment, and a low-level controller that uses an established control method to robustly execute the chosen primitives. Our framework learns a controller that adapts on the fly to challenging environmental changes, including novel scenarios not seen during training. The learned controller is up to 85 percent more energy efficient and more robust than baseline methods. We also deploy the controller on a physical robot without any randomization or adaptation scheme.
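The two-level structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `select_primitive` and `run_episode`, the epsilon-greedy selection rule, and the callable interfaces for the high-level and low-level controllers are all assumptions made for illustration. The abstract specifies only that a learned high-level controller picks from a set of primitives and that a model-based low-level controller executes them.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical interfaces (assumptions, not taken from the paper):
#   high_level(observation) -> one score per primitive (e.g. learned Q-values)
#   low_level(primitive, observation) -> next observation after executing it
HighLevel = Callable[[float], Sequence[float]]
LowLevel = Callable[[str, float], float]


def select_primitive(scores: Sequence[float], epsilon: float,
                     rng: random.Random) -> int:
    """Epsilon-greedy choice over primitive indices (an assumed selection rule)."""
    if rng.random() < epsilon:
        return rng.randrange(len(scores))  # explore: uniform random primitive
    # exploit: index of the highest-scoring primitive
    return max(range(len(scores)), key=lambda i: scores[i])


def run_episode(high_level: HighLevel, low_level: LowLevel,
                primitives: List[str], observation: float,
                steps: int, epsilon: float = 0.0, seed: int = 0) -> List[str]:
    """Alternate high-level primitive selection with low-level execution."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(steps):
        idx = select_primitive(high_level(observation), epsilon, rng)
        # The low-level controller stands in for the model-based method.
        observation = low_level(primitives[idx], observation)
        chosen.append(primitives[idx])
    return chosen


# Toy stand-ins: prefer primitive "a" early, switch to "b" once obs >= 2.
toy_high = lambda obs: [1.0, 0.0] if obs < 2 else [0.0, 1.0]
toy_low = lambda primitive, obs: obs + 1
trace = run_episode(toy_high, toy_low, ["a", "b"], observation=0, steps=4)
```

In this toy run the high-level scores change once the observation crosses a threshold, so the selected primitive switches partway through the episode. That mirrors, in a very simplified way, the idea of adapting the primitive choice to changes in the environment.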
