SurfaceNet: A Surface Focused Network for Pedestrian Detection and Segmentation in 3D Point Clouds

Pedestrian detection is an important problem for autonomous driving, and detecting and segmenting pedestrians in point clouds remains challenging. In this paper, we propose a method named SurfaceNet for both tasks. Specifically, we propose a novel representation, named surface map, that represents a point cloud as a 2D pseudo-image. For pedestrian detection, the proposed method comprises four modules: 1) a grid feature encoder that can process an arbitrary number of points within each grid cell; 2) a surface feature convolutional module that employs a set of 2D convolutional layers to extract high-level features; 3) a view transform module that transforms features from the front view to the bird's-eye view; and 4) an anchor-free 3D object detection head that produces rotated 3D bounding box predictions. For semantic segmentation, the 2D pseudo-image is segmented and the results are re-projected onto the original point cloud to obtain a point-wise segmentation. Experimental results on the KITTI dataset show that our method achieves promising performance on pedestrian detection and segmentation in point clouds.
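
The abstract does not specify how the surface map is constructed, so the following is only a minimal sketch of the general idea of turning a point cloud into a front-view 2D grid and re-projecting per-pixel labels back to the points. It assumes a spherical (azimuth/elevation) projection, which may differ from the actual surface map; the function names, grid size, and field-of-view values are hypothetical.

```python
import math


def project_to_front_view(points, width=8, height=4,
                          fov_up_deg=15.0, fov_down_deg=-15.0):
    """Assign each (x, y, z) point to a (row, col) cell of a front-view grid.

    Assumption: a spherical projection stands in for the paper's surface map.
    Returns (grid, cells). grid[row][col] lists the indices of the points that
    fall in that cell (any number of points per cell is allowed). cells[i] is
    the (row, col) of point i, or None if it lies outside the vertical field
    of view or at the sensor origin.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    grid = [[[] for _ in range(width)] for _ in range(height)]
    cells = []
    for i, (x, y, z) in enumerate(points):
        rng = math.sqrt(x * x + y * y + z * z)
        if rng == 0.0:
            cells.append(None)
            continue
        azimuth = math.atan2(y, x)        # horizontal angle in [-pi, pi]
        elevation = math.asin(z / rng)    # vertical angle
        if not (fov_down <= elevation <= fov_up):
            cells.append(None)
            continue
        col = min(int((azimuth + math.pi) / (2.0 * math.pi) * width), width - 1)
        row = min(int((fov_up - elevation) / (fov_up - fov_down) * height),
                  height - 1)
        grid[row][col].append(i)
        cells.append((row, col))
    return grid, cells


def reproject_labels(label_image, cells, ignore_label=-1):
    """Copy each pixel label of a 2D segmentation back to its source points."""
    return [ignore_label if cell is None else label_image[cell[0]][cell[1]]
            for cell in cells]
```

In this sketch, a 2D segmentation network would produce `label_image` from features encoded per cell; points that project to the same cell share a label, and points outside the grid receive `ignore_label`.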
