Learning to Jump from Pixels

Today’s robotic quadruped systems can robustly walk over a diverse range of rough but continuous terrains, where the terrain elevation varies gradually. Locomotion on discontinuous terrains, such as those with gaps or obstacles, presents a complementary set of challenges. In discontinuous settings, it becomes necessary to plan ahead using visual inputs and to execute agile behaviors beyond robust walking, such as jumps. Such dynamic motion results in significant motion of onboard sensors, which introduces a new set of challenges for real-time visual processing. The requirement for agility and terrain awareness in this setting reinforces the need for robust control. We present Depth-based Impulse Control (DIC), a method for synthesizing highly agile visually guided locomotion behaviors. DIC affords the flexibility of model-free learning but regularizes behavior through explicit model-based optimization of ground reaction forces. We evaluate the proposed method both in simulation and in the real world.
