Place Recognition in Semi-Dense Maps: Geometric and Learning-Based Approaches

Place recognition and tracking using vision are challenging tasks for robotics and augmented reality systems operating in large, dynamic environments. When these systems must operate reliably over very long periods, such as months or years, severe environmental changes introduce further challenges because they can significantly alter the visual appearance of a scene. Unlocking long-term, large-scale visual place recognition therefore requires new methodologies that improve localization under difficult conditions. As shown in previous work, gains in robustness can be achieved by exploiting the 3D structural information of a scene. This information, extracted from image sequences, carries more discriminative cues than individual images alone. In this paper, we propose to represent a scene’s structure with semi-dense point clouds, because they are highly informative and simple to generate with mature visual odometry and SLAM systems. We then cast place recognition as an instance of pose retrieval and evaluate several techniques, including recent learning-based approaches, for producing discriminative descriptors of semi-dense point clouds. Evaluated on the recently published and challenging Oxford RobotCar Dataset, our proposed methodology outperforms image-based place recognition, with improvements of up to 30% in precision across strong appearance changes. To the best of our knowledge, we are the first to propose place recognition in semi-dense maps.
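
To illustrate what casting place recognition as pose retrieval can look like, the following is a minimal sketch, not the method evaluated in the paper. It assumes that each mapped place is already summarized by a fixed-length descriptor and labeled with a pose, and that a query is answered by returning the pose of the nearest database descriptor under Euclidean distance. The names `retrieve_pose` and `l2_distance`, the three-dimensional toy descriptors, and the `(x, y, heading)` pose format are all hypothetical choices made for this example.

```python
import math
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]
Pose = Tuple[float, float, float]  # hypothetical (x, y, heading) pose label


def l2_distance(a: Descriptor, b: Descriptor) -> float:
    """Euclidean distance between two descriptors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_pose(query: Descriptor,
                  database: List[Tuple[Descriptor, Pose]]) -> Tuple[Pose, float]:
    """Return the pose of the database entry closest to the query, and its distance."""
    if not database:
        raise ValueError("database is empty")
    best_desc, best_pose = min(
        database, key=lambda entry: l2_distance(query, entry[0])
    )
    return best_pose, l2_distance(query, best_desc)


# Toy database: each entry pairs a made-up descriptor with a made-up pose.
database = [
    ([0.0, 1.0, 0.0], (0.0, 0.0, 0.0)),
    ([1.0, 0.0, 0.0], (10.0, 0.0, 1.57)),
    ([0.0, 0.0, 1.0], (10.0, 10.0, 3.14)),
]

# The query descriptor is closest to the second entry, so its pose is returned.
pose, distance = retrieve_pose([0.9, 0.1, 0.0], database)
```

The exhaustive linear scan keeps the sketch short. A large map would more plausibly use an approximate nearest-neighbor index, and the quality of the retrieval depends entirely on how discriminative the descriptors are.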
