Improving Lidar-Based Semantic Segmentation of Top-View Grid Maps by Learning Features in Complementary Representations

In this paper we introduce a novel way to predict semantic information from sparse, single-shot LiDAR measurements in the context of autonomous driving. In particular, we fuse learned features from complementary representations. The approach is aimed specifically at improving the semantic segmentation of top-view grid maps. Towards this goal the 3D LiDAR point cloud is projected onto two orthogonal 2D representations. For each representation a tailored deep learning architecture is developed to effectively extract semantic information which are fused by a superordinate deep neural network. The contribution of this work is threefold: (1) We examine different stages within the segmentation network for fusion. (2) We quantify the impact of embedding different features. (3) We use the findings of this survey to design a tailored deep neural network architecture leveraging respective advantages of different representations. Our method is evaluated using the SemanticKITTI dataset which provides a point-wise semantic annotation of more than 23.000 LiDAR measurements.

[1]  Mohammed Bennamoun,et al.  Deep Learning for 3D Point Clouds: A Survey , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Sascha Wirges,et al.  Exploiting Multi-Layer Grid Maps for Surround-View Semantic Segmentation of Sparse LiDAR Data , 2020, 2020 IEEE Intelligent Vehicles Symposium (IV).

[3]  A. Markham,et al.  RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Cyrill Stachniss,et al.  RangeNet ++: Fast and Accurate LiDAR Semantic Segmentation , 2019, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[5]  Fusion of range measurements and semantic estimates in an evidential framework / Fusion von Distanzmessungen und semantischen Größen im Rahmen der Evidenztheorie , 2019, tm - Technisches Messen.

[6]  Martin Lauer,et al.  Capturing Object Detection Uncertainty in Multi-Layer Grid Maps , 2019, 2019 IEEE Intelligent Vehicles Symposium (IV).

[7]  Kurt Keutzer,et al.  SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud , 2018, 2019 International Conference on Robotics and Automation (ICRA).

[8]  Sascha Wirges,et al.  Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks , 2018, 2018 21st International Conference on Intelligent Transportation Systems (ITSC).

[9]  Yin Zhou,et al.  VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[10]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Leonidas J. Guibas,et al.  PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[12]  Marc Pollefeys,et al.  The Stixel World: A medium-level representation of traffic scenes , 2017, Image Vis. Comput..

[13]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Jing Huang,et al.  Point cloud labeling using 3D Convolutional Neural Network , 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).

[15]  Daniel Cremers,et al.  FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-Based CNN Architecture , 2016, ACCV.

[16]  Wolfram Burgard,et al.  Multimodal deep learning for robust RGB-D object recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[17]  Ingmar Posner,et al.  Voting for Voting in Online Point Cloud Object Detection , 2015, Robotics: Science and Systems.

[18]  Avideh Zakhor,et al.  Sensor fusion for semantic segmentation of urban scenes , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[19]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Jitendra Malik,et al.  Simultaneous Detection and Segmentation , 2014, ECCV.

[21]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Luc Van Gool,et al.  The Pascal Visual Object Classes Challenge: A Retrospective , 2014, International Journal of Computer Vision.

[23]  Camille Couprie,et al.  Learning Hierarchical Features for Scene Labeling , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[25]  Uwe Franke,et al.  Towards a Global Optimal Multi-Layer Stixel Representation of Dense 3D Data , 2011, BMVC.

[26]  Multi-level partition of unity implicits , 2005, ACM Trans. Graph..

[27]  Viii Supervisor Sonar-Based Real-World Mapping and Navigation , 2001 .

[28]  Hans P. Moravec Sensor Fusion in Certainty Grids for Mobile Robots , 1988, AI Mag..