Paper Information - Enhancing Grid-Based 3D Object Detection in Autonomous Driving With Improved Dimensionality Reduction

Point cloud object detection is a pivotal technology in autonomous driving and robotics. Currently, the majority of cutting-edge point cloud detectors utilize Bird’s Eye View (BEV) for detection, as it allows them to take advantage of well-explored 2D detection techniques. Nevertheless, dimensionality reduction of features from 3D space to BEV space unavoidably leads to information loss, and there is a lack of research on this issue. Existing methods typically obtain BEV features by collapsing voxel or point features along the height dimension via a pooling operation or convolution, resulting in a significant decrease in geometric information. To tackle this problem, we present a new point cloud backbone network for grid-based object detection, MDRNet, which is based on adaptive dimensionality reduction and multi-level spatial residual strategies. In MDRNet, the Spatial-aware Dimensionality Reduction (SDR) is designed to dynamically concentrate on the essential components of the object during 3D-to-BEV transformation. Moreover, the Multi-level Spatial Residuals (MSR) strategy is proposed to effectively fuse multi-level spatial information in BEV feature maps. Our MDRNet can be employed on any existing grid-based object detector, resulting in a remarkable improvement in performance. Numerous experiments conducted on nuScenes, KITTI and DAIR-V have shown that MDRNet surpasses existing SOTA approaches. In particular, on the nuScenes dataset, we attained an impressive 7.2% mAP and 5.0% NDS enhancement compared with CenterPoint.
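
The abstract contrasts two ways of collapsing 3D grid features along the height dimension: a fixed pooling operation and a reduction that adapts to the input. The NumPy sketch below illustrates that contrast only; it is not the authors' SDR module. The function names, the `(C, Z, Y, X)` tensor layout, and the random per-voxel `scores` (standing in for what a learned layer might produce) are all assumptions made for illustration.

```python
import numpy as np


def collapse_max(voxels):
    """Fixed reduction: max-pool over the height axis, (C, Z, Y, X) -> (C, Y, X)."""
    return voxels.max(axis=1)


def collapse_weighted(voxels, scores):
    """Adaptive reduction: softmax-weighted sum over the height axis.

    `scores` has shape (Z, Y, X) and is a hypothetical stand-in for the
    output of a learned scoring layer; it is not taken from the paper.
    """
    # Subtract the per-column maximum for numerical stability before exponentiating.
    shifted = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)
    # Broadcast the (Z, Y, X) weights over the channel axis, then sum over Z.
    return (voxels * weights[None]).sum(axis=1)


rng = np.random.default_rng(0)
voxels = rng.normal(size=(4, 8, 16, 16))  # assumed layout: C=4, Z=8, Y=16, X=16
scores = rng.normal(size=(8, 16, 16))     # hypothetical per-voxel importance

bev_max = collapse_max(voxels)
bev_weighted = collapse_weighted(voxels, scores)
print(bev_max.shape, bev_weighted.shape)
```

Both functions return a BEV map of shape `(C, Y, X)`. With uniform scores the weighted variant reduces to average pooling over height, and with scores sharply peaked at one height slice it approaches selecting that slice, which is the sense in which a learned weighting can focus on particular parts of a column.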
