Paper information - DTC: Deep Tracking Control - A Unifying Approach to Model-Based Planning and Reinforcement-Learning for Versatile and Robust Locomotion

Legged locomotion is a complex control problem that requires both accuracy and robustness to cope with real-world challenges. Legged systems have traditionally been controlled using trajectory optimization with inverse dynamics. Such hierarchical model-based methods are appealing due to intuitive cost function tuning, accurate planning, and most importantly, the insightful understanding gained from more than one decade of extensive research. However, model mismatch and violation of assumptions are common sources of faulty operation and may hinder successful sim-to-real transfer. Simulation-based reinforcement learning, on the other hand, results in locomotion policies with unprecedented robustness and recovery skills. Yet, all learning algorithms struggle with sparse rewards emerging from environments where valid footholds are rare, such as gaps or stepping stones. In this work, we propose a hybrid control architecture that combines the advantages of both worlds to simultaneously achieve greater robustness, foot-placement accuracy, and terrain generalization. Our approach utilizes a model-based planner to roll out a reference motion during training. A deep neural network policy is trained in simulation, aiming to track the optimized footholds. We evaluate the accuracy of our locomotion pipeline on sparse terrains, where pure data-driven methods are prone to fail. Furthermore, we demonstrate superior robustness in the presence of slippery or deformable ground when compared to model-based counterparts. Finally, we show that our proposed tracking controller generalizes across different trajectory optimization methods not seen during training. In conclusion, our work unites the predictive capabilities and optimality guarantees of online planning with the inherent robustness attributed to offline learning.

Farbod Farshidian | Fabian Jenelten | Junzhe He | Marco Hutter
