Assigning Visual Words to Places for Loop Closure Detection

Place recognition of pre-visited areas, widely known as Loop Closure Detection (LCD), constitutes one of the most important components in robotic applications, where the robot needs to estimate its pose while navigating through the field (e.g., simultaneous localization and mapping). In this paper, we present a novel approach for LCD based on the assignment of Visual Words (VWs) to particular places of the traversed path. The system operates in real time and does not require any pre-training procedure, such as visual vocabulary construction or descriptor-space dimensionality reduction. A place is defined through a dynamic segmentation of the incoming image stream and is assigned with VWs through the usage of an on-line clustering algorithm. At query time, image descriptors are converted into VWs on the map accumulating votes to the corresponding places. By means of a probability function, the mechanism is capable of identifying a loop closing candidate place. A nearest neighbor voting scheme on the descriptors' space allows the system to select the most appropriate image match at the chosen place. Geometrical and temporal consistency checks are applied on the proposed loop closing pair increasing the system's performance. Evaluation took place on several publicly available and challenging datasets offering high precision and recall scores as compared to other state-of-the-art approaches.

[1]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[2]  Luc Van Gool,et al.  SURF: Speeded Up Robust Features , 2006, ECCV.

[3]  Paul Newman,et al.  Appearance-only SLAM at large scale with FAB-MAP 2.0 , 2011, Int. J. Robotics Res..

[4]  Gary R. Bradski,et al.  ORB: An efficient alternative to SIFT or SURF , 2011, 2011 International Conference on Computer Vision.

[5]  Jannik Fritsch,et al.  A new performance measure and evaluation benchmark for road detection algorithms , 2013, 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013).

[6]  Winston Churchill,et al.  The New College Vision and Laser Data Set , 2009, Int. J. Robotics Res..

[7]  Michael Bosse,et al.  Placeless Place-Recognition , 2014, 2014 2nd International Conference on 3D Vision.

[8]  Stefano Soatto,et al.  A Simple Hierarchical Pooling Data Structure for Loop Closure , 2015, ECCV.

[9]  Hugh F. Durrant-Whyte,et al.  Simultaneous localization and mapping: part I , 2006, IEEE Robotics & Automation Magazine.

[10]  Bernd Fritzke,et al.  A Growing Neural Gas Network Learns Topologies , 1994, NIPS.

[11]  Rafael García,et al.  Automatic Visual Bag-of-Words for Online Robot Navigation and Mapping , 2012, IEEE Transactions on Robotics.

[12]  Jon Louis Bentley,et al.  Multidimensional binary search trees used for associative searching , 1975, CACM.

[13]  Paul Newman,et al.  Loop closure detection in SLAM by combining visual and spatial appearance , 2006, Robotics Auton. Syst..

[14]  Peter I. Corke,et al.  Visual Place Recognition: A Survey , 2016, IEEE Transactions on Robotics.

[15]  Luis Miguel Bergasa,et al.  Fusion and binarization of CNN features for robust topological localization across seasons , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[16]  Kai Ma,et al.  Enhancing Place Recognition Using Joint Intensity - Depth Analysis and Synthetic Data , 2016, ECCV Workshops.

[17]  Antonios Gasteratos,et al.  Encoding the description of image sequences: A two-layered pipeline for loop closure detection , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[18]  Dorian Gálvez-López,et al.  Bags of Binary Words for Fast Place Recognition in Image Sequences , 2012, IEEE Transactions on Robotics.

[19]  Cyrill Stachniss,et al.  Simultaneous Localization and Mapping , 2016, Springer Handbook of Robotics, 2nd Ed..

[20]  Juan D. Tardós,et al.  ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras , 2016, IEEE Transactions on Robotics.

[21]  Roland Siegwart,et al.  BRISK: Binary Robust invariant scalable keypoints , 2011, 2011 International Conference on Computer Vision.

[22]  Francisco Angel Moreno,et al.  A collection of outdoor robotic datasets with centimeter-accuracy ground truth , 2009, Auton. Robots.

[23]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[24]  Jean-Arcady Meyer,et al.  Fast and Incremental Method for Loop-Closure Detection Using Bags of Visual Words , 2008, IEEE Transactions on Robotics.

[25]  Roland Siegwart,et al.  Visual Place Recognition with Probabilistic Vertex Voting , 2016, ArXiv.

[26]  C. N. Liu,et al.  Approximating discrete probability distributions with dependence trees , 1968, IEEE Trans. Inf. Theory.