Learning Vision-Guided Dynamic Locomotion Over Challenging Terrains

Abstract: Legged robots have become increasingly capable and popular in recent years because of their potential to bring the mobility of autonomous agents to the next level. This work presents a deep reinforcement learning approach that uses Proximal Policy Optimisation to learn a robust LiDAR-based perceptual locomotion policy in a partially observable environment. Visual perception is critical for actively overcoming challenging terrains; to exploit it, we propose a novel learning strategy, the Dynamic Reward Strategy (DRS), which serves as an effective heuristic for learning a versatile gait using a neural network architecture, without the need to access history data. Evaluated in a modified version of the OpenAI Gym environment, the proposed approach achieves a success rate above 90% on all tested challenging terrains.
