Fast and Efficient Locomotion via Learned Gait Transitions

We focus on the problem of developing efficient controllers for quadrupedal robots. Animals actively switch gaits at different speeds to lower their energy consumption. In this paper, we devise a hierarchical learning framework in which distinct locomotion gaits and natural gait transitions emerge automatically from a simple energy-minimization reward. We use reinforcement learning to train a high-level gait policy that specifies the contact schedule of each foot, while a low-level model predictive controller (MPC) optimizes the motor torques so that the robot tracks a desired velocity under that gait pattern. We test our learning framework on a quadruped robot and demonstrate automatic gait transitions, from walking to trotting to fly-trotting, as the robot increases its speed up to 2.5 m/s (5 body lengths/s). We show that the learned hierarchical controller consumes substantially less energy than baseline controllers across a wide range of locomotion speeds.
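
To make the hierarchy concrete, below is a minimal sketch of how such a two-level controller could be wired together, assuming the high-level policy outputs per-leg phase offsets and a stepping frequency and the low-level controller converts the resulting contact schedule into torques. All names here (GaitPolicy, contact_schedule, low_level_controller) and the specific parameterization are illustrative assumptions, not the authors' implementation; in particular, the MPC is replaced by a placeholder.

```python
# Hedged sketch of a hierarchical gait controller: a slow high-level gait
# policy plus a fast low-level torque controller. Names and parameterization
# are hypothetical, not the paper's implementation.
import numpy as np


class GaitPolicy:
    """High-level policy: maps an observation to gait parameters
    (per-leg phase offsets and a stepping frequency)."""

    def __init__(self, obs_dim: int, n_legs: int = 4, seed: int = 0):
        rng = np.random.default_rng(seed)
        # A random linear policy stands in for the RL-trained policy.
        self.W = rng.normal(scale=0.1, size=(n_legs + 1, obs_dim))

    def act(self, obs: np.ndarray):
        out = self.W @ obs
        phase_offsets = np.mod(out[:-1], 1.0)           # per-leg phase in [0, 1)
        step_freq = 1.0 + np.clip(out[-1], -0.5, 2.0)   # steps per second
        return phase_offsets, step_freq


def contact_schedule(t, phase_offsets, step_freq, duty_factor=0.6):
    """Boolean stance/swing flags for each leg at time t."""
    phase = np.mod(t * step_freq + phase_offsets, 1.0)
    return phase < duty_factor  # True = stance, False = swing


def low_level_controller(state, contacts, desired_velocity):
    """Placeholder for the low-level MPC: would return joint torques that
    realize the commanded contact pattern and velocity."""
    return np.zeros(12)


# Control loop: the gait policy is queried at a low rate, the torque
# controller at a high rate.
policy = GaitPolicy(obs_dim=8)
state = np.zeros(8)                      # stand-in for robot state/observation
t, dt_low, high_level_period = 0.0, 0.002, 0.5
phases, freq = policy.act(state)
while t < 2.0:
    if t % high_level_period < dt_low:   # re-query the high-level gait policy
        phases, freq = policy.act(state)
    contacts = contact_schedule(t, phases, freq)
    torques = low_level_controller(state, contacts, desired_velocity=1.0)
    t += dt_low
```

The key design point carried over from the abstract is the separation of timescales: the gait policy only decides the contact pattern, while all torque-level reasoning stays in the lower layer.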
