Adversarial unsupervised domain adaptation for 3D semantic segmentation with multi-modal learning

Abstract Semantic segmentation of 3D point clouds plays an essential role in applications such as autonomous driving, robot control, and mapping. In general, a segmentation model trained on one source domain suffers a severe decline in performance when applied to a different target domain because of the cross-domain discrepancy. Various Unsupervised Domain Adaptation (UDA) approaches have been proposed to tackle this issue. However, most handle only uni-modal data and do not explore how to learn from multi-modal data containing both 2D images and 3D point clouds. To address this gap, we propose an Adversarial Unsupervised Domain Adaptation (AUDA) based 3D semantic segmentation framework. The proposed AUDA leverages the complementary information between 2D images and 3D point clouds through cross-modal learning and adversarial learning. In addition, real scenarios exhibit a highly imbalanced data distribution, so we further develop a simple and effective threshold-moving technique, applied at the final inference stage, to mitigate this issue. Finally, we conduct experiments on three unsupervised domain adaptation scenarios, i.e., Country-to-Country (USA → Singapore), Day-to-Night, and Dataset-to-Dataset (A2D2 → SemanticKITTI). The experimental results demonstrate the effectiveness of the proposed method, which significantly improves segmentation performance for rare classes. Code and trained models are available at https://github.com/weiliu-ai/auda.
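
The abstract does not spell out the exact threshold-moving rule, so the following is only a minimal sketch of one common variant, not necessarily the authors' implementation: each predicted class probability is divided by an estimate of that class's prior frequency before taking the argmax, which shifts decisions toward rare classes at inference time. The function name `threshold_moving`, the `class_priors` argument, and the example numbers are all hypothetical.

```python
from typing import List, Sequence


def threshold_moving(
    probs: Sequence[Sequence[float]],
    class_priors: Sequence[float],
    eps: float = 1e-12,
) -> List[int]:
    """Assign a label to each point after rescaling probabilities by inverse class priors.

    probs: per-point class probabilities (e.g. softmax outputs), shape [N][C].
    class_priors: assumed class frequencies, e.g. counted on source-domain labels.
    """
    labels = []
    for p in probs:
        if len(p) != len(class_priors):
            raise ValueError("each probability vector needs one entry per class")
        # Dividing by the prior lowers the effective threshold for rare classes.
        adjusted = [pi / max(prior, eps) for pi, prior in zip(p, class_priors)]
        labels.append(max(range(len(adjusted)), key=adjusted.__getitem__))
    return labels


# Hypothetical two-class case: class 0 is frequent (90%), class 1 is rare (10%).
priors = [0.9, 0.1]
probs = [[0.7, 0.3], [0.95, 0.05]]

plain = [max(range(len(p)), key=p.__getitem__) for p in probs]  # ordinary argmax
moved = threshold_moving(probs, priors)                         # prior-adjusted argmax
```

In this toy case the ordinary argmax assigns both points to the frequent class, while the adjusted rule moves the first point to the rare class and leaves the confidently frequent second point unchanged. Only the inference-time decision changes; the trained model is untouched.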
