Multi-Robot Deep Reinforcement Learning for Mobile Navigation

Deep reinforcement learning algorithms require large and diverse datasets to learn successful policies for perception-based mobile navigation. However, gathering such datasets with a single robot can be prohibitively expensive. Collecting data with multiple robotic platforms, each with possibly different dynamics, is a more scalable approach, but how can deep reinforcement learning algorithms leverage such heterogeneous datasets? In this work, we propose a deep reinforcement learning algorithm with hierarchically integrated models (HInt). At training time, HInt learns separate perception and dynamics models; at test time, it integrates the two models hierarchically and plans actions with the integrated model. Planning with hierarchically integrated models allows the algorithm to train on datasets gathered by a variety of platforms, while respecting the physical capabilities of the deployment robot at test time. Our mobile navigation experiments show that HInt outperforms conventional hierarchical policies and single-source approaches.
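To make the test-time integration concrete, below is a minimal sketch of how a sampling-based planner (here, the cross-entropy method) could compose the two models: the dynamics model lifts the deployment robot's low-level actions into a shared abstract action space, and the perception model scores the resulting abstract action sequences from the current image. The interface names `dynamics_model` and `perception_model`, and all hyperparameters, are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of planning with hierarchically integrated models.
# `dynamics_model` and `perception_model` are hypothetical callables,
# assumed here for illustration:
#   dynamics_model(actions)  -> abstract action sequence (e.g. velocities)
#                               shared across training platforms
#   perception_model(image, abstract_actions) -> scalar reward estimate
import numpy as np

def plan_with_integrated_models(obs_image, dynamics_model, perception_model,
                                horizon=10, action_dim=2, num_samples=64,
                                num_iters=3, num_elites=8):
    """Cross-entropy-method planner over the deployment robot's actions."""
    mean = np.zeros((horizon, action_dim))
    std = np.ones((horizon, action_dim))
    for _ in range(num_iters):
        # Sample candidate low-level action sequences for this robot.
        samples = mean + std * np.random.randn(num_samples, horizon, action_dim)
        # Dynamics model maps robot-specific actions into the shared
        # abstract action space the perception model was trained on.
        abstract = [dynamics_model(a) for a in samples]
        # Perception model scores each abstract sequence from the image.
        rewards = np.array([perception_model(obs_image, s) for s in abstract])
        # Refit the sampling distribution to the highest-scoring candidates.
        elites = samples[np.argsort(rewards)[-num_elites:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mean[0]  # execute the first action, then replan (MPC-style)
```

Because planning happens over the deployment robot's own action space and only the abstract actions are shared, the perception model can be trained on data from many platforms while the executed plan still respects the current robot's dynamics.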
