Learning quadrupedal locomotion over challenging terrain

A learning-based locomotion controller enables a quadrupedal ANYmal robot to traverse challenging natural environments. Legged locomotion can extend the operational domain of robots to some of the most challenging environments on Earth. However, conventional controllers for legged locomotion are based on elaborate state machines that explicitly trigger the execution of motion primitives and reflexes. These designs have increased in complexity but have fallen short of the generality and robustness of animal locomotion. Here, we present a robust controller for blind quadrupedal locomotion in challenging natural environments. Our approach incorporates proprioceptive feedback in locomotion control and demonstrates zero-shot generalization from simulation to natural environments. The controller is trained by reinforcement learning in simulation and is driven by a neural network policy that acts on a stream of proprioceptive signals. It retains its robustness under conditions that were never encountered during training: deformable terrains such as mud and snow, dynamic footholds such as rubble, and overground impediments such as thick vegetation and gushing water. The presented work indicates that robust locomotion in natural environments can be achieved by training in simple domains.
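
The abstract describes the controller only at a high level: a neural network policy that consumes a stream of proprioceptive signals and is trained by reinforcement learning in simulation. The sketch below illustrates one plausible shape for such a policy; the observation layout, history length, and network sizes are assumptions chosen for illustration, not the authors' architecture.

```python
# Minimal sketch (not the authors' implementation): a policy that maps a window
# of proprioceptive observations to joint position targets. All dimensions and
# the MLP structure are assumptions for illustration only.
import torch
import torch.nn as nn

N_JOINTS = 12                     # assumed: 3 actuated joints per leg on a quadruped
OBS_PER_STEP = 2 * N_JOINTS + 9   # assumed: joint positions, joint velocities, body state
HISTORY_LEN = 50                  # assumed length of the proprioceptive history window


class ProprioceptivePolicy(nn.Module):
    """Maps a flattened history of proprioceptive signals to joint position targets."""

    def __init__(self) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(HISTORY_LEN * OBS_PER_STEP, 256),
            nn.Tanh(),
            nn.Linear(256, 128),
            nn.Tanh(),
            nn.Linear(128, N_JOINTS),  # one position target per actuated joint
        )

    def forward(self, obs_history: torch.Tensor) -> torch.Tensor:
        # obs_history: (batch, HISTORY_LEN, OBS_PER_STEP)
        return self.net(obs_history.flatten(start_dim=1))


# Usage: at each control step, push the latest proprioceptive reading into the
# window and query the policy for the next joint targets.
policy = ProprioceptivePolicy()
window = torch.zeros(1, HISTORY_LEN, OBS_PER_STEP)  # placeholder history buffer
joint_targets = policy(window)                      # shape: (1, N_JOINTS)
```

In a reinforcement-learning setup such as the one the abstract refers to, a policy of this form would be optimized in simulation against a locomotion reward and then deployed on hardware without further training.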
