Combining Learning-based Locomotion Policy with Model-based Manipulation for Legged Mobile Manipulators

Deep reinforcement learning produces robust locomotion policies for legged robots over challenging terrain. To date, few studies have leveraged model-based methods to combine these locomotion skills with the precise control of manipulators. Here, we incorporate external dynamics plans into learning-based locomotion policies for mobile manipulation. We train the base policy in simulation by applying a random wrench sequence to the robot base and adding a noisy prediction of that wrench sequence to the policy observations. The policy thereby learns to counteract a partially known future disturbance. For deployment, the random wrench sequences are replaced with wrench predictions generated from the dynamics plans of a model predictive controller. The policy adapts zero-shot to manipulators unseen during training. On hardware, we demonstrate stable locomotion of legged robots using the predicted external wrench.
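The training-time disturbance scheme described above can be illustrated with a minimal sketch. This is not the authors' implementation: the horizon length, force and torque bounds, Gaussian noise model, observation size, and all function names (`sample_wrench_sequence`, `noisy_prediction`, `build_observation`) are assumptions chosen only to show the idea of pairing an applied random wrench sequence with a noisy prediction of it in the policy observation.

```python
import numpy as np

def sample_wrench_sequence(rng, horizon, max_force=50.0, max_torque=20.0):
    """Draw a random 6-D wrench (force, torque) for each future step.

    The bounds are illustrative assumptions, not values from the paper.
    """
    force = rng.uniform(-max_force, max_force, size=(horizon, 3))
    torque = rng.uniform(-max_torque, max_torque, size=(horizon, 3))
    return np.concatenate([force, torque], axis=1)  # shape (horizon, 6)

def noisy_prediction(rng, wrenches, noise_std=5.0):
    """Corrupt the true sequence so the policy only sees a partial forecast.

    Additive Gaussian noise is an assumed noise model.
    """
    return wrenches + rng.normal(0.0, noise_std, size=wrenches.shape)

def build_observation(proprio, predicted_wrenches):
    """Append the flattened wrench prediction to the proprioceptive state."""
    return np.concatenate([proprio, predicted_wrenches.ravel()])

rng = np.random.default_rng(0)
horizon = 10
proprio = np.zeros(48)  # hypothetical proprioceptive observation size

# In simulation: `wrenches` would be applied to the robot base,
# while the policy receives only the noisy prediction.
wrenches = sample_wrench_sequence(rng, horizon)
obs = build_observation(proprio, noisy_prediction(rng, wrenches))
```

At deployment, under the same assumptions, `noisy_prediction(...)` would be swapped for the wrench sequence predicted from the model predictive controller's dynamics plan, leaving the observation layout unchanged.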
