Learning whom to trust in navigation: dynamically switching between classical and neural planning

Navigation of terrestrial robots is typically addressed either with simultaneous localization and mapping (SLAM) followed by classical planning on the dynamically built maps, or with machine learning (ML), often through end-to-end training with reinforcement learning (RL) or imitation learning (IL). Recently, modular designs have achieved promising results, and hybrid algorithms combining ML with classical planning have been proposed. However, existing methods implement these combinations with hand-crafted functions, which cannot fully exploit the complementary nature of the two policies or the complex regularities relating scene structure to planning performance. Our work builds on the hypothesis that the strengths and weaknesses of neural and classical planners follow regularities that can be learned from training data, in particular from interactions. This rests on the assumption that both trained planners and the mapping algorithms underlying classical planning have failure cases that depend on the semantics of the scene, and that this dependence is learnable: for instance, certain areas, objects, or scene structures can be reconstructed more easily than others. We propose a hierarchical method composed of a high-level planner that dynamically switches between a classical and a neural planner. We train all neural policies entirely in simulation and evaluate the method both in simulation and in real experiments with a LoCoBot robot, showing significant gains in performance, in particular in the real environment. We also offer qualitative conjectures about the nature of the data regularities exploited by the high-level planner.
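To make the switching idea concrete, below is a minimal PyTorch sketch of how such a hierarchical agent could be wired: a learned high-level switch periodically decides whether to trust the classical or the neural low-level planner, and the chosen planner then produces the actions. All names (HighLevelSwitch, navigate, switch_period, the planner and environment interfaces) are hypothetical illustrations under assumed interfaces, not the paper's actual implementation.

    # Hypothetical sketch of a high-level planner switching between a classical
    # and a neural low-level planner; interfaces are assumed, not the paper's code.
    import torch
    import torch.nn as nn


    class HighLevelSwitch(nn.Module):
        """Learned policy that picks which low-level planner to trust next."""

        def __init__(self, feat_dim: int = 512, hidden: int = 256):
            super().__init__()
            # Assumption: observations are already encoded into a feature vector
            # (e.g. by a visual encoder); the switch only sees this embedding.
            self.net = nn.Sequential(
                nn.Linear(feat_dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, 2),  # logits: 0 = classical, 1 = neural
            )

        def forward(self, obs_feat: torch.Tensor) -> torch.Tensor:
            return self.net(obs_feat)


    def navigate(env, encoder, switch, classical_planner, neural_planner,
                 switch_period: int = 10, max_steps: int = 500):
        """Roll out one episode, re-deciding which planner to use every few steps."""
        obs = env.reset()
        use_neural = False
        for t in range(max_steps):
            if t % switch_period == 0:
                with torch.no_grad():
                    logits = switch(encoder(obs))
                    use_neural = logits.argmax(dim=-1).item() == 1
            planner = neural_planner if use_neural else classical_planner
            action = planner.act(obs)      # both planners share one action space
            obs, done = env.step(action)   # assumed (observation, done) interface
            if done:
                break

In this sketch the switch could itself be trained with RL in simulation, receiving the same task reward as the low-level planners; the switching period and the choice of observation embedding are free design parameters not specified here.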
