Double-Domain Adaptation Semantics for Retrieval-Based Long-Term Visual Localization

Because of seasonal and illumination changes, long-term visual localization in dynamic environments is a crucial problem in autonomous driving and robotics. Image-based retrieval is currently an effective approach to this problem. However, content information alone is often insufficient to recognize the same location as its appearance changes over time. To address this, a double-domain network model that combines semantic information with content information is proposed for the visual localization task. The approach requires only the Virtual KITTI 2 dataset for training. To reduce the domain gap between real scenes and virtual images, a cross-predictive semantic segmentation mechanism is introduced. In addition, by introducing a domain loss function and a triplet semantic loss function, the model achieves good domain adaptation and generalizes well to other real datasets. A series of experiments on the Extended CMU-Seasons dataset and the Oxford RobotCar-Seasons dataset demonstrates that the proposed network model outperforms state-of-the-art baselines for retrieval-based visual localization in challenging environments.
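
To make the retrieval setting concrete, the following is a minimal sketch of the two generic ingredients the abstract names: nearest-neighbor retrieval over global image descriptors and a triplet-style loss. It is not the proposed network. The descriptor vectors, the function names, and the margin value are illustrative assumptions, and the standard Euclidean triplet margin loss stands in for the paper's triplet semantic loss, whose exact form is not given here.

```python
import math

def l2_normalize(v):
    """Scale a descriptor to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard triplet margin loss (assumed form, hypothetical margin).

    It is zero once the anchor is closer to the positive (same place)
    than to the negative (different place) by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def retrieve(query, database):
    """Return the index of the database descriptor nearest to the query."""
    q = l2_normalize(query)
    dists = [euclidean(q, l2_normalize(d)) for d in database]
    return min(range(len(dists)), key=dists.__getitem__)

# Toy descriptors: two reference places and one query taken near place 0.
database = [[1.0, 0.0], [0.0, 1.0]]
query = [0.9, 0.1]
best = retrieve(query, database)
```

In this toy setting, localization amounts to assigning the query the pose of the retrieved reference image. During training, the loss pulls descriptors of the same place together across conditions and pushes descriptors of different places apart.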
