Det2Seg: A Two-Stage Approach for Road Object Segmentation from 3D Point Clouds

Object segmentation from 3D point clouds is an important problem in real-world applications. However, because point clouds are sparse and irregular, instance segmentation often performs poorly in scenes such as those encountered in autonomous driving. There are two main difficulties. First, in an unconstrained outdoor scene, background noise often makes up the majority of the point set. Second, small-scale objects are difficult to recognize owing to their greater uncertainty. In this paper, we propose Det2Seg, a two-stage approach that alleviates these issues and yields more accurate instance segmentation of road objects. In the first stage, we adopt PointPillars [3] to detect regions of interest, which localize and classify objects at a coarse level; in the second stage, we group the points from the detected regions into pillars, encode them into a new data format, and feed the result into a 2D convolutional neural network that performs fine-grained, domain-specific instance segmentation. We evaluate our approach on raw LiDAR (Light Detection And Ranging) data from the KITTI dataset [11]. The experimental results show that our approach substantially outperforms prior work. In particular, it stands out for its ability to recognize and segment small-scale objects: for the cyclist class, it improves Intersection over Union (IoU) by over 20% relative to the state of the art.
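As a rough illustration of the second-stage input preparation and of the IoU metric mentioned above, the following NumPy sketch crops the points that fall inside one detected region, groups them into pillars on the x-y plane, and computes a per-class IoU. It is a minimal sketch under stated assumptions, not the Det2Seg implementation: the function names, the axis-aligned box (box yaw is ignored), the pillar size, and the toy data are all hypothetical.

```python
import numpy as np


def crop_points_to_box(points, center, size):
    """Keep points inside an axis-aligned box (assumption: box yaw is ignored)."""
    center = np.asarray(center, dtype=float)
    half = np.asarray(size, dtype=float) / 2.0
    inside = np.all(np.abs(points[:, :3] - center) <= half, axis=1)
    return points[inside]


def group_into_pillars(points, origin_xy, pillar_size):
    """Assign each point to a vertical column (pillar) indexed on the x-y grid."""
    offsets = points[:, :2] - np.asarray(origin_xy, dtype=float)
    cells = np.floor(offsets / pillar_size).astype(int).tolist()
    pillars = {}
    for cell, point in zip(cells, points):
        pillars.setdefault(tuple(cell), []).append(point)
    return {cell: np.stack(pts) for cell, pts in pillars.items()}


def class_iou(pred, target, class_id):
    """Point-wise Intersection over Union for a single class label."""
    pred_mask = np.asarray(pred) == class_id
    target_mask = np.asarray(target) == class_id
    union = np.logical_or(pred_mask, target_mask).sum()
    if union == 0:
        return float("nan")  # class absent from both prediction and ground truth
    return float(np.logical_and(pred_mask, target_mask).sum() / union)


# Toy point cloud: columns are x, y, z, reflectance (hypothetical values).
points = np.array([
    [0.1, 0.1, 0.0, 0.5],
    [0.3, 0.1, 0.2, 0.4],
    [0.9, 0.9, 0.1, 0.3],
    [5.0, 5.0, 0.0, 0.2],  # lies outside the detected region
])

roi_points = crop_points_to_box(points, center=(0.5, 0.5, 0.0), size=(1.0, 1.0, 1.0))
pillars = group_into_pillars(roi_points, origin_xy=(0.0, 0.0), pillar_size=0.5)
iou = class_iou(pred=[1, 1, 0, 2], target=[1, 0, 0, 2], class_id=1)
```

Cropping before pillarization discards most background points, which is the motivation the abstract gives for running detection first; a real pipeline would also need to handle rotated boxes and a fixed-size pillar encoding.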

[1] Kurt Keutzer, et al. SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud, 2018, 2019 International Conference on Robotics and Automation (ICRA).

[2] Forrest N. Iandola, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size, 2016, ArXiv.

[3] Jiong Yang, et al. PointPillars: Fast Encoders for Object Detection From Point Clouds, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4] Kurt Keutzer, et al. SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[5] Yuan Wang, et al. PointSeg: Real-Time Semantic Segmentation Based on 3D LiDAR Point Cloud, 2018, ArXiv.

[6] Bin Yang, et al. PIXOR: Real-time 3D Object Detection from Point Clouds, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[7] Yin Zhou, et al. VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[8] Bin Yang, et al. HDNET: Exploiting HD Maps for 3D Object Detection, 2018, CoRL.

[9] Bo Li, et al. 3D fully convolutional network for vehicle detection in point cloud, 2016, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[10] Dushyant Rao, et al. Vote3Deep: Fast object detection in 3D point clouds using efficient convolutional neural networks, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[11] Andreas Geiger, et al. Are we ready for autonomous driving? The KITTI vision benchmark suite, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[12] Leonidas J. Guibas, et al. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).