Paper information - Deep-Geometric 6 DoF Localization from a Single Image in Topo-metric Maps


We describe a Deep-Geometric Localizer that estimates the full 6 Degree of Freedom (DoF) global pose of a camera from a single image in a previously mapped environment. Our map is topo-metric: it consists of discrete topological nodes whose 6 DoF poses are known. Each topo-node also comprises a set of points whose 2D features and 3D locations are stored during mapping. For the mapping phase, we use a stereo camera and a regular stereo visual SLAM pipeline. During the localization phase, we take a single camera image, localize it to a topological node using Deep Learning, and apply a geometric algorithm (PnP) to the matched 2D features (and their 3D positions in the topo map) to determine the full 6 DoF globally consistent pose of the camera. Our method decouples the mapping and localization algorithms and sensors (stereo for mapping, mono for localization), and allows accurate 6 DoF pose estimation in a previously mapped environment using a single camera. The approach has potential VR/AR and localization applications on single-camera devices such as mobile phones and drones. In simulated as well as real environments, our hybrid algorithm compares favourably with PoseNet, a fully Deep-Learning based method that regresses pose from a single image.
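The two-stage localization described in the abstract (pick a topological node, then solve PnP against that node's stored 3D points) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the node layout (`desc`, `T_world_node`, `points`) is hypothetical, a nearest-descriptor lookup stands in for the Deep Learning node classifier, a linear DLT solver stands in for whichever PnP variant the paper uses, and the 2D-3D matches are assumed to be given in corresponding order.

```python
import numpy as np


def nearest_node(query_desc, node_descs):
    """Index of the node whose global descriptor is closest to the query.

    Stand-in for the learned image-to-node step (assumption).
    """
    return int(np.argmin(np.linalg.norm(node_descs - query_desc, axis=1)))


def pnp_dlt(points_3d, pixels, K):
    """Linear PnP via DLT. Needs >= 6 non-coplanar points.

    Returns R, t such that x_cam = R @ x_node + t.
    """
    n = len(points_3d)
    # Convert pixels to normalised image coordinates (removes the intrinsics).
    pix_h = np.hstack([pixels, np.ones((n, 1))])
    xy = (np.linalg.inv(K) @ pix_h.T).T
    X = np.hstack([points_3d, np.ones((n, 1))])  # homogeneous 3D points

    # Two linear constraints per correspondence on the 12 entries of [R|t].
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = X
    A[0::2, 8:12] = -xy[:, [0]] * X
    A[1::2, 4:8] = X
    A[1::2, 8:12] = -xy[:, [1]] * X

    # The null vector of A is [R|t] up to an unknown (possibly negative) scale.
    P = np.linalg.svd(A)[2][-1].reshape(3, 4)

    # Project the left 3x3 block onto a rotation and recover the scale.
    U, S, Vt = np.linalg.svd(P[:, :3])
    R = U @ Vt
    t = P[:, 3] / S.mean()
    if np.linalg.det(R) < 0:  # the scale was negative: flip both
        R, t = -R, -t
    return R, t


def localize(query_desc, pixels, nodes, K):
    """Return (node index, 4x4 global camera pose T_world_cam)."""
    idx = nearest_node(query_desc, np.array([nd["desc"] for nd in nodes]))
    node = nodes[idx]
    # pixels[i] is assumed to be matched to node["points"][i].
    R, t = pnp_dlt(node["points"], pixels, K)
    T_cam_node = np.eye(4)
    T_cam_node[:3, :3] = R
    T_cam_node[:3, 3] = t
    # Chain the known node pose with the camera pose relative to the node.
    T_world_cam = node["T_world_node"] @ np.linalg.inv(T_cam_node)
    return idx, T_world_cam
```

With real, noisy feature matches a robust variant (for example PnP inside a RANSAC loop, followed by nonlinear refinement) would normally replace the plain DLT step; the sketch only shows how the node pose and the node-relative PnP result combine into a globally consistent pose.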
