Metric learning with generator for closed loop detection in VSLAM

The development of driverless cars, unmanned aerial vehicles, human–computer interaction, and artificial intelligence has promoted the Internet of Things (IoT) industry, in which Visual Simultaneous Localization and Mapping (VSLAM) is an important localization and mapping technique. Closed loop detection alleviates the error that accumulates while VSLAM runs. Traditional closed loop detection methods mostly rely on manually defined features, which are subjective and unstable and cope poorly with complex and repetitive scenes. Triplet loss-based metric learning has therefore been considered a better solution for closed loop detection. In this paper, first, a generator is constructed to generate the feature vectors of hard negative samples. Second, the triplet loss and a generative loss are combined to construct the loss function. The well-trained model converts keyframes into feature vectors, and the similarity of two keyframes is evaluated by the distance between their feature vectors, which determines whether a closed loop is formed. Finally, the TUM dataset is used to evaluate the precision and recall of the proposed metric learning method, and the well-trained model is applied to establish the loop closing thread of a VSLAM system. The experimental results illustrate the feasibility and effectiveness of metric learning-based closed loop detection, which can be further applied to practical VSLAM systems.
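The pipeline described in the abstract (a triplet loss over feature vectors, a distance test between keyframe embeddings, and a precision/recall evaluation) can be illustrated with a minimal sketch. This is not the paper's implementation: the squared Euclidean distance, the margin of 0.2, the decision threshold of 0.5, and the function names `triplet_loss`, `is_loop_closure`, and `precision_recall` are all assumptions chosen for illustration. The generator is not modeled here; in the described method, the `negative` argument would be a hard negative feature vector produced by it.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on squared Euclidean distances.

    The margin value is an illustrative assumption, not taken from the paper.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    # The loss is zero once the negative is farther than the positive by `margin`.
    return np.maximum(d_pos - d_neg + margin, 0.0)


def is_loop_closure(feat_a, feat_b, threshold=0.5):
    """Report a loop when two keyframe embeddings are closer than `threshold`.

    The threshold is hypothetical; in practice it would be tuned on validation data.
    """
    return float(np.linalg.norm(feat_a - feat_b)) < threshold


def precision_recall(predicted, actual):
    """Compute precision and recall from boolean loop predictions and ground truth."""
    predicted = np.asarray(predicted, dtype=bool)
    actual = np.asarray(actual, dtype=bool)
    tp = int(np.sum(predicted & actual))
    fp = int(np.sum(predicted & ~actual))
    fn = int(np.sum(~predicted & actual))
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall
```

Sweeping `threshold` over a range of values and calling `precision_recall` at each setting would trace out a precision-recall curve of the kind commonly used to compare loop detectors.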
