Learning Kinematic Feasibility for Mobile Manipulation through Deep Reinforcement Learning

Mobile manipulation tasks remain one of the critical challenges for the widespread adoption of autonomous robots in both service and industrial scenarios. While planning approaches are good at generating feasible whole-body robot trajectories, they struggle with dynamic environments and with incorporating constraints imposed by the task and the environment. Dynamic motion models in the action space, on the other hand, struggle to generate kinematically feasible trajectories for mobile manipulation actions. We propose a deep reinforcement learning approach that learns feasible dynamic motions for a mobile base while the end-effector follows a task-space trajectory generated by an arbitrary system to fulfill the task at hand. This modular formulation has several benefits: it enables us to readily transform a broad range of end-effector motions into mobile applications, it allows us to use the kinematic feasibility of the end-effector trajectory as a dense reward signal, and it generalises to unseen end-effector motions at test time. We demonstrate the capabilities of our approach in extensive simulated and real-world experiments on multiple mobile robot platforms with different kinematic abilities and different types of wheeled bases.
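To make the idea of kinematic feasibility as a dense reward concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a hypothetical planar two-link arm mounted on the base, with illustrative link lengths `l1` and `l2`, and treats an end-effector target as feasible when it lies within the arm's reachable annulus in the base frame. All function names, the penalty value and the arm model are assumptions made for illustration.

```python
import math


def to_base_frame(base_pose, point):
    """Express a world-frame 2D point in the frame of the mobile base.

    base_pose is (x, y, theta); point is (x, y). Both are illustrative types.
    """
    bx, by, btheta = base_pose
    dx, dy = point[0] - bx, point[1] - by
    c, s = math.cos(btheta), math.sin(btheta)
    # Apply the inverse rotation of the base orientation.
    return (c * dx + s * dy, -s * dx + c * dy)


def is_reachable(base_pose, ee_target, l1=0.5, l2=0.4):
    """Feasibility test for a hypothetical planar two-link arm.

    The target is reachable if its distance from the arm mount lies
    between |l1 - l2| and l1 + l2 (the arm's reachable annulus).
    """
    x, y = to_base_frame(base_pose, ee_target)
    r = math.hypot(x, y)
    return abs(l1 - l2) <= r <= l1 + l2


def feasibility_reward(base_pose, ee_target, penalty=-1.0):
    """Per-step reward: 0 when the target is feasible, a penalty otherwise."""
    return 0.0 if is_reachable(base_pose, ee_target) else penalty


def rollout_return(base_poses, ee_targets):
    """Sum the per-step feasibility rewards along an end-effector trajectory."""
    return sum(feasibility_reward(b, t) for b, t in zip(base_poses, ee_targets))
```

Because the reward is evaluated at every step of the end-effector trajectory, a base policy receives feedback as soon as it drifts to a pose from which the current target cannot be reached, rather than only at the end of an episode. A real system would replace the annulus test with an inverse kinematics query for the actual arm.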
