Visual Memory for Robust Path Following

Humans routinely retrace a path in a novel environment both forwards and backwards despite uncertainty in their motion. In this paper, we present an approach for doing so. Given a demonstration of a path, a first network generates an abstraction of the path. Equipped with this abstraction, a second network then observes the world and decides how to act in order to retrace the path under noisy actuation and a changing environment. The two networks are optimized end-to-end at training time. We evaluate the method in two realistic simulators, performing path following both forwards and backwards. Our experiments show that our approach outperforms both a classical approach to solving this task as well as a number of other baselines.

[1]  Daniel Cremers,et al.  LSD-SLAM: Large-Scale Direct Monocular SLAM , 2014, ECCV.

[2]  Andrew W. Fitzgibbon,et al.  KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera , 2011, UIST.

[3]  Vladlen Koltun,et al.  Semi-parametric Topological Memory for Navigation , 2018, ICLR.

[4]  Jan-Michael Frahm,et al.  Structure-from-Motion Revisited , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Ali Farhadi,et al.  Target-driven visual navigation in indoor scenes using deep reinforcement learning , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[6]  Sean L. Bowman,et al.  Probabilistic data association for semantic SLAM , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[7]  John J. Leonard,et al.  Towards visual ego-motion learning in robots , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[8]  Razvan Pascanu,et al.  Learning to Navigate in Complex Environments , 2016, ICLR.

[9]  James R. Bergen,et al.  Visual odometry , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[10]  Ruslan Salakhutdinov,et al.  Neural Map: Structured Memory for Deep Reinforcement Learning , 2017, ICLR.

[11]  Timothy D. Barfoot,et al.  Robust Monocular Visual Teach and Repeat Aided by Local Ground Planarity and Color‐constant Imagery , 2017, J. Field Robotics.

[12]  Martial Hebert,et al.  Learning Transferable Policies for Monocular Reactive MAV Control , 2016, ISER.

[13]  Jürgen Schmidhuber,et al.  A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots , 2016, IEEE Robotics and Automation Letters.

[14]  David W. Murray,et al.  Mobile Robot Localisation Using Active Vision , 1998, ECCV.

[15]  Vijay Kumar,et al.  Memory Augmented Control Networks , 2017, ICLR.

[16]  Steven M. LaValle,et al.  Rapidly-Exploring Random Trees: Progress and Prospects , 2000 .

[17]  Ruslan Salakhutdinov,et al.  Active Neural Localization , 2018, ICLR.

[18]  B. Faverjon,et al.  Probabilistic Roadmaps for Path Planning in High-Dimensional Con(cid:12)guration Spaces , 1996 .

[19]  Abhinav Gupta,et al.  Learning to fly by crashing , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[20]  Silvio Savarese,et al.  3D Semantic Parsing of Large-Scale Indoor Spaces , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Jitendra Malik,et al.  On Evaluation of Embodied Navigation Agents , 2018, ArXiv.

[22]  Leslie Pack Kaelbling,et al.  Provably safe robot navigation with obstacle uncertainty , 2017, Robotics: Science and Systems.

[23]  José Ruíz Ascencio,et al.  Visual simultaneous localization and mapping: a survey , 2012, Artificial Intelligence Review.

[24]  Paul Newman,et al.  FAB-MAP: Probabilistic Localization and Mapping in the Space of Appearance , 2008, Int. J. Robotics Res..

[25]  Jan Kautz,et al.  Geometry-Aware Learning of Maps for Camera Localization , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[26]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[27]  Torsten Sattler,et al.  Semantic Visual Localization , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[28]  Wolfram Burgard,et al.  Neural SLAM: Learning to Explore with External Memory , 2017, 1706.09520.

[29]  Raia Hadsell,et al.  Learning to Navigate in Cities Without a Map , 2018, NeurIPS.

[30]  Roberto Cipolla,et al.  PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[31]  Jitendra Malik,et al.  Learning a Multi-View Stereo Machine , 2017, NIPS.

[32]  Lydia E. Kavraki,et al.  Probabilistic roadmaps for path planning in high-dimensional configuration spaces , 1996, IEEE Trans. Robotics Autom..

[33]  Dean Pomerleau,et al.  ALVINN, an autonomous land vehicle in a neural network , 2015 .

[34]  Noah Snavely,et al.  Unsupervised Learning of Depth and Ego-Motion from Video , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Sergey Levine,et al.  Self-Supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[36]  Jitendra Malik,et al.  Zero-Shot Visual Imitation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[37]  Rahul Sukthankar,et al.  Cognitive Mapping and Planning for Visual Navigation , 2017, International Journal of Computer Vision.

[38]  Jan-Michael Frahm,et al.  Pixelwise View Selection for Unstructured Multi-View Stereo , 2016, ECCV.

[39]  Ramesh Raskar,et al.  Deep Visual Teach and Repeat on Path Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[40]  Matthias Nießner,et al.  Matterport3D: Learning from RGB-D Data in Indoor Environments , 2017, 2017 International Conference on 3D Vision (3DV).

[41]  John Canny,et al.  The complexity of robot motion planning , 1988 .

[42]  Paul Timothy Furgale,et al.  Visual teach and repeat for long‐range rover autonomy , 2010, J. Field Robotics.

[43]  Thomas A. Funkhouser,et al.  Semantic Scene Completion from a Single Depth Image , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).