Paper Information - Loop Closure Detection Via Maximization of Mutual Information

An image can be described by the frequency with which visual words appear in it. Bag-of-visual-words (BoVW)-based loop closure detection uses this representation for its efficiency and effectiveness. However, traditional BoVW-based approaches suffer from false positive loops, because redundant words in the vocabulary make scenes ambiguous, and they fail to detect bidirectional loops in monocular mode. To overcome these problems, we propose a novel vocabulary construction algorithm, hierarchical sequential information bottleneck (HsIB), which leverages the maximization of mutual information (MMI) mechanism. First, feature descriptors are extracted from training images for visual vocabulary construction. Second, HsIB extracts discriminative yet informative visual words through the MMI mechanism, treating feature descriptor clustering as a process of data compression. Finally, the clustering process reaches a tradeoff between compactness and discrimination, which improves the performance of traditional BoVW-based loop closure detection. The proposed method is compared with state-of-the-art methods on publicly available datasets. We also create a challenging dataset to further evaluate the performance of HsIB on bidirectional loops. To the best of our knowledge, we are the first to apply the information bottleneck (IB) method to loop closure detection in visual SLAM (vSLAM), and we obtain impressive results.
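The two ideas in the abstract, describing an image by visual word frequencies and judging a vocabulary by the mutual information it preserves, can be illustrated with a small sketch. This is not the HsIB algorithm; it is a toy illustration under stated assumptions. The function names (`bovw_histogram`, `mutual_information`, `merge_words`), the Euclidean nearest-word assignment, and the toy joint distribution are all hypothetical and do not come from the paper.

```python
import numpy as np

def bovw_histogram(descriptors, vocabulary):
    """Normalized visual word frequencies for one image (illustrative helper)."""
    # Squared Euclidean distance from every descriptor to every visual word.
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # index of the nearest word for each descriptor
    counts = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return counts / counts.sum()

def mutual_information(joint):
    """I(W;Y) in bits for a joint table p(word, image)."""
    joint = joint / joint.sum()
    pw = joint.sum(axis=1, keepdims=True)  # marginal over words
    py = joint.sum(axis=0, keepdims=True)  # marginal over images
    nz = joint > 0                         # skip zero cells (0 * log 0 = 0)
    return float((joint[nz] * np.log2(joint[nz] / (pw @ py)[nz])).sum())

def merge_words(joint, i, j):
    """Merge word j into word i, as a clustering step would (hypothetical helper)."""
    merged = np.delete(joint, j, axis=0)   # np.delete returns a copy
    merged[i if i < j else i - 1] += joint[j]
    return merged

# Toy image: four 2-D descriptors, vocabulary of two words (assumed values).
vocabulary = np.array([[0.0, 0.0], [10.0, 10.0]])
descriptors = np.array([[0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [10.0, 11.0]])
hist = bovw_histogram(descriptors, vocabulary)

# Toy joint p(word, image): words 0 and 1 are redundant (same image profile).
joint = np.array([[0.25, 0.00],
                  [0.25, 0.00],
                  [0.00, 0.50]])
mi_full = mutual_information(joint)
mi_redundant_merge = mutual_information(merge_words(joint, 0, 1))
mi_lossy_merge = mutual_information(merge_words(joint, 1, 2))
```

In this toy case, merging the two redundant words leaves the mutual information at 1 bit, while merging a word with one that has a different image profile drops it to about 0.31 bits. This is the sense in which compressing a vocabulary can remove redundancy without giving up discrimination.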

Yangdong Ye | Ge Zhang | Xiaoqiang Yan
