ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation

Many Reinforcement Learning (RL) approaches use joint control signals (positions, velocities, torques) as the action space for continuous control tasks. We propose to lift the action space to a higher level in the form of subgoals for a motion generator (a combination of a motion planner and a trajectory executor). We argue that, by lifting the action space and by leveraging sampling-based motion planners, we can efficiently use RL to solve complex, long-horizon tasks that could not be solved with existing RL methods in the original action space. We introduce ReLMoGen, a framework that combines a learned policy to predict subgoals and a motion generator to plan and execute the motion needed to reach these subgoals. To validate our method, we apply ReLMoGen to two types of tasks: 1) Interactive Navigation tasks, navigation problems where interactions with the environment are required to reach the destination, and 2) Mobile Manipulation tasks, manipulation tasks that require moving the robot base. These problems are challenging because they are usually long-horizon, hard to explore during training, and comprise alternating phases of navigation and interaction. Our method is benchmarked on a diverse set of seven robotics tasks in photo-realistic simulation environments. In all settings, ReLMoGen outperforms state-of-the-art Reinforcement Learning and Hierarchical Reinforcement Learning baselines. ReLMoGen also shows outstanding transferability between different motion generators at test time, indicating great potential for transfer to real robots.
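For concreteness, the sketch below illustrates the lifted action space as a gym-style environment wrapper: the policy outputs a subgoal, the motion generator turns it into a low-level trajectory, and the wrapper executes that trajectory and accumulates reward before the policy acts again. This is a minimal illustration under assumed interfaces, not the paper's implementation; `MotionGenerator`, `SubgoalEnvWrapper`, and the linear-interpolation stand-in for a planner are all hypothetical.

```python
from typing import List

import numpy as np


class MotionGenerator:
    """Placeholder for the motion generator (motion planner + trajectory executor).

    A real implementation would run a sampling-based planner and return
    joint-level commands that reach the subgoal collision-free.
    """

    def plan(self, observation: np.ndarray, subgoal: np.ndarray) -> List[np.ndarray]:
        # Hypothetical stand-in: linearly interpolate toward the subgoal
        # instead of running a real planner.
        current = observation[: subgoal.shape[0]]
        return [current + t * (subgoal - current) for t in np.linspace(0.1, 1.0, 10)]


class SubgoalEnvWrapper:
    """Exposes subgoals as the RL action space.

    Each step() consumes one subgoal, asks the motion generator for a
    trajectory, executes it over many low-level environment steps, and
    returns the accumulated reward, so the policy sees a single
    (state, subgoal, reward) transition per subgoal.
    """

    def __init__(self, env, motion_generator: MotionGenerator):
        self.env = env  # any gym-style env with reset()/step()
        self.mg = motion_generator
        self._last_obs = None

    def reset(self):
        self._last_obs = self.env.reset()
        return self._last_obs

    def step(self, subgoal: np.ndarray):
        trajectory = self.mg.plan(self._last_obs, subgoal)
        if not trajectory:
            # Planner failure: treat the subgoal as a no-op transition.
            return self._last_obs, 0.0, False, {"plan_failed": True}
        total_reward, done, info = 0.0, False, {}
        for joint_command in trajectory:
            obs, reward, done, info = self.env.step(joint_command)
            total_reward += reward
            self._last_obs = obs
            if done:
                break
        return self._last_obs, total_reward, done, info
```

Lifting actions to subgoals shortens the effective decision horizon for the RL policy and delegates collision-free motion to the planner, which is what makes long-horizon Interactive Navigation and Mobile Manipulation tasks tractable.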
