论文信息 - RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection

RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection

The detection of 3D objects from LiDAR data is a critical component in most autonomous driving systems. Safe, high speed driving needs larger detection ranges, which are enabled by new LiDARs. These larger detection ranges require more efficient and accurate detection models. Towards this goal, we propose Range Sparse Net (RSN) – a simple, efficient, and accurate 3D object detector – in order to tackle real time 3D object detection in this extended detection regime. RSN predicts foreground points from range images and applies sparse convolutions on the selected foreground points to detect objects. The lightweight 2D convolutions on dense range images results in significantly fewer selected foreground points, thus enabling the later sparse convolutions in RSN to efficiently operate. Combining features from the range image further enhance detection accuracy. RSN runs at more than 60 frames per second on a 150m × 150m detection region on Waymo Open Dataset (WOD) while being more accurate than previously published detectors. As of 11/2020, RSN is ranked first in the WOD leaderboard based on the APH/LEVEL_1 metrics for LiDAR-based pedestrian and vehicle detection, while being several times faster than alternatives.

[1] Leonidas J. Guibas,et al. Deep Hough Voting for 3D Object Detection in Point Clouds , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[2] Yin Zhou,et al. End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds , 2019, CoRL.

[3] Yi Li,et al. R-FCN: Object Detection via Region-based Fully Convolutional Networks , 2016, NIPS.

[4] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5] Shaoshuai Shi,et al. PV-RCNN: The Top-Performing LiDAR-only Solutions for 3D Detection / 3D Tracking / Domain Adaptation of Waymo Open Dataset Challenges , 2020, ArXiv.

[6] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[7] Leonidas J. Guibas,et al. Frustum PointNets for 3D Object Detection from RGB-D Data , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[8] Ross B. Girshick,et al. Fast R-CNN , 2015, 1504.08083.

[9] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[10] Laurens van der Maaten,et al. Submanifold Sparse Convolutional Networks , 2017, ArXiv.

[11] Bo Li,et al. SECOND: Sparsely Embedded Convolutional Detection , 2018, Sensors.

[12] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.

[13] Dragomir Anguelov,et al. Scalability in Perception for Autonomous Driving: Waymo Open Dataset , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14] Carlos Vallespi-Gonzalez,et al. LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15] Kaiming He,et al. Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[16] Xingyi Zhou,et al. Objects as Points , 2019, ArXiv.

[17] Bin Yang,et al. PIXOR: Real-time 3D Object Detection from Point Clouds , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[18] Leonidas J. Guibas,et al. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19] Leonidas J. Guibas,et al. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[20] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.

[21] Ross B. Girshick,et al. Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22] Xiaogang Wang,et al. PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[23] Yin Zhou,et al. StarNet: Targeted Computation for Object Detection in Point Clouds , 2019, ArXiv.

[24] Ali Farhadi,et al. You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25] Yu Wang,et al. AFDet: Anchor Free One Stage 3D Object Detection , 2020, ArXiv.

[26] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.

[27] Xiaoyong Shen,et al. STD: Sparse-to-Dense 3D Object Detector for Point Cloud , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[28] Shiliang Pu,et al. RangeRCNN: Towards Fast and Accurate 3D Object Detection with Range Image Representation , 2020, ArXiv.

[29] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30] Yue Wang,et al. Pillar-based Object Detection for Autonomous Driving , 2020, ECCV.

[31] Rares Ambrus,et al. 3D Packing for Self-Supervised Monocular Depth Estimation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[32] Yin Zhou,et al. VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[33] Dragomir Anguelov,et al. Range Conditioned Dilated Convolutions for Scale Invariant 3D Object Detection , 2020, CoRL.

[34] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[35] Dushyant Rao,et al. Vote3Deep: Fast object detection in 3D point clouds using efficient convolutional neural networks , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[36] Quoc V. Le,et al. Improving 3D Object Detection through Progressive Population Based Augmentation , 2020, ECCV.

[37] Hei Law,et al. CornerNet: Detecting Objects as Paired Keypoints , 2018, ECCV.

[38] Ji Wan,et al. Multi-view 3D Object Detection Network for Autonomous Driving , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39] Jiong Yang,et al. PointPillars: Fast Encoders for Object Detection From Point Clouds , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[40] Ruigang Yang,et al. IoU Loss for 2D/3D Object Detection , 2019, 2019 International Conference on 3D Vision (3DV).