Using a Waffle Iron for Automotive Point Cloud Semantic Segmentation

Semantic segmentation of point clouds in autonomous driving datasets requires techniques that can process large numbers of points over large fields of view. Today, most deep networks designed for this task exploit 3D sparse convolutions to reduce memory and computational loads. The best methods then exploit specificities of rotating lidar sampling patterns to further improve performance, e.g., cylindrical voxels or range images (for feature fusion from multiple point cloud representations). In contrast, we show that one can build a well-performing point-based backbone free of these specialized tools. This backbone, WaffleIron, relies heavily on generic MLPs and dense 2D convolutions, making it easy to implement, and has only a few parameters, which are easy to tune. Despite its simplicity, our experiments on SemanticKITTI and nuScenes show that WaffleIron competes with the best methods designed specifically for these autonomous driving datasets. Hence, WaffleIron is a strong, easy-to-implement baseline for semantic segmentation of sparse outdoor point clouds.
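
The abstract does not spell out how point features reach a dense 2D convolution. As a rough illustration only, the sketch below assumes one plausible arrangement: point features are averaged into the cells of a 2D grid, mixed spatially with a dense 3x3 depthwise convolution, read back at each point with a residual connection, and then passed through a per-point MLP. All function names, the grid layout, and the residual structure are hypothetical choices made for this example, not the published architecture.

```python
def project_to_grid(xy, feats, grid_size, cell):
    """Average the features of all points that fall into each 2D cell."""
    channels = len(feats[0])
    grid = [[[0.0] * channels for _ in range(grid_size)] for _ in range(grid_size)]
    counts = [[0] * grid_size for _ in range(grid_size)]
    cells = []
    for (x, y), f in zip(xy, feats):
        # Clamp indices so that out-of-range points land in a border cell.
        i = min(max(int(x / cell), 0), grid_size - 1)
        j = min(max(int(y / cell), 0), grid_size - 1)
        cells.append((i, j))
        counts[i][j] += 1
        for c in range(channels):
            grid[i][j][c] += f[c]
    for i in range(grid_size):
        for j in range(grid_size):
            if counts[i][j]:
                grid[i][j] = [s / counts[i][j] for s in grid[i][j]]
    return grid, cells


def depthwise_conv3x3(grid, kernels):
    """Dense 3x3 depthwise convolution with zero padding; kernels[c] is 3x3."""
    n = len(grid)
    channels = len(kernels)
    out = [[[0.0] * channels for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    a, b = i + di, j + dj
                    if 0 <= a < n and 0 <= b < n:
                        for c in range(channels):
                            out[i][j][c] += kernels[c][di + 1][dj + 1] * grid[a][b][c]
    return out


def linear(x, weight, bias):
    """Plain fully connected layer: weight is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def point_mlp(f, w1, b1, w2, b2):
    """Two-layer MLP with ReLU, applied to one point's feature vector."""
    hidden = [max(0.0, h) for h in linear(f, w1, b1)]
    return linear(hidden, w2, b2)


def block(xy, feats, grid_size, cell, kernels, w1, b1, w2, b2):
    """One hypothetical layer: 2D spatial mixing, then per-point channel mixing."""
    grid, cells = project_to_grid(xy, feats, grid_size, cell)
    mixed = depthwise_conv3x3(grid, kernels)
    # Back-project: each point reads its cell's output, added as a residual.
    feats = [[p + m for p, m in zip(f, mixed[i][j])] for f, (i, j) in zip(feats, cells)]
    # Channel mixing: the same MLP for every point, again with a residual.
    return [[p + o for p, o in zip(f, point_mlp(f, w1, b1, w2, b2))] for f in feats]


# Toy usage: three points with two channels, an identity kernel, a zeroed MLP.
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
xy = [(0.5, 0.5), (0.6, 0.4), (2.5, 2.5)]
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out = block(xy, feats, 4, 1.0, [identity, identity], zeros, [0.0, 0.0], zeros, [0.0, 0.0])
```

With the identity kernel and a zeroed MLP, each point simply receives the mean feature of its cell on top of its own feature, which makes the data flow easy to check by hand. In practice one would express the same steps with a tensor library rather than nested lists.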
