LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds

Semantic segmentation of LiDAR point clouds is an important task in autonomous driving. However, training deep models via conventional supervised methods requires large datasets, which are costly to label. Label-efficient segmentation approaches are critical for scaling a model to new operational domains and for improving performance on rare cases. While most prior works focus on indoor scenes, we are among the first to propose a label-efficient semantic segmentation pipeline for outdoor scenes with LiDAR point clouds. Our method co-designs an efficient labeling process with semi/weakly supervised learning and is applicable to nearly any 3D semantic segmentation backbone. Specifically, we leverage geometric patterns in outdoor scenes to perform a heuristic pre-segmentation that reduces manual labeling, and we jointly design the learning targets with the labeling process. In the learning step, we use prototype learning to obtain more descriptive point embeddings, and multi-scan distillation to exploit richer semantics from temporally aggregated point clouds and boost the performance of single-scan models. Evaluated on the SemanticKITTI and nuScenes datasets, our proposed method outperforms existing label-efficient methods. With extremely limited human annotations (e.g., 0.1% point labels), our method is even highly competitive with its fully supervised counterpart trained on 100% of the labels.
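The multi-scan distillation idea can be illustrated with a minimal sketch. The abstract does not state the exact loss, so this example assumes a standard temperature-softened KL divergence between a multi-scan teacher and a single-scan student, evaluated on the points of the single scan that both models predict. The names `softmax` and `distillation_loss`, the temperature value, and the toy logits are all hypothetical and chosen only for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one point's class logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) over points, using softened distributions.

    Assumption: this is a generic knowledge-distillation loss, not
    necessarily the exact objective used in the paper.
    """
    total = 0.0
    for s, t in zip(student_logits, teacher_logits):
        p = softmax(t, temperature)  # teacher target (multi-scan model)
        q = softmax(s, temperature)  # student prediction (single-scan model)
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return total / len(student_logits)

# Toy example: two points, three classes (made-up logits).
teacher = [[4.0, 1.0, 0.0], [0.5, 3.0, 0.2]]
student_close = [[3.5, 1.2, 0.1], [0.4, 2.8, 0.3]]
student_far = [[0.0, 1.0, 4.0], [3.0, 0.5, 0.2]]

print(distillation_loss(student_close, teacher))
print(distillation_loss(student_far, teacher))
```

A student whose predictions agree with the teacher receives a smaller loss than one that disagrees, so minimizing this term pushes the single-scan model toward the semantics the teacher extracts from the aggregated scans.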
