Group Equivariant BEV for 3D Object Detection

Recently, 3D object detection has attracted significant attention and achieved continuous improvement in real road scenarios. Environmental information is collected from a single sensor or fused from multiple sensors to detect objects of interest. However, most current 3D object detection approaches focus on developing advanced network architectures to improve detection precision rather than on accounting for dynamic driving scenes, in which the data collected by vehicle-mounted sensors contain various perturbations. As a result, existing work still cannot handle the perturbation issue. To solve this problem, we propose a group equivariant bird's eye view network (GeqBevNet) based on group equivariance theory, which introduces the concept of group equivariance into the BEV fusion object detection network. The group equivariant network is embedded into the fused BEV feature map to facilitate BEV-level rotation-equivariant feature extraction, thus leading to lower average orientation error. To demonstrate its effectiveness, GeqBevNet is evaluated on the nuScenes validation set, where the mean average orientation error (mAOE) decreases to 0.325. Experimental results demonstrate that GeqBevNet extracts more rotation-equivariant features for 3D object detection in real road scenes and improves object orientation prediction.
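To make the notion of BEV-level rotational equivariance concrete, the following is a minimal NumPy sketch of a C4 (90-degree rotation group) lifting correlation applied to a single-channel BEV map. It is an illustration under stated assumptions, not the GeqBevNet architecture: the abstract does not specify the group, the layer design, or where exactly the layers sit, and the function names `correlate2d_same` and `c4_lifting_conv` are hypothetical. The sketch shows the defining property: rotating the input BEV map rotates each output map and cyclically shifts the orientation channels, so no orientation information is lost.

```python
import numpy as np


def correlate2d_same(x, k):
    """Plain 2D cross-correlation with zero padding; output has the shape of x.

    Assumes an odd-sized kernel so that it has a well-defined center.
    """
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros(x.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out


def c4_lifting_conv(bev, kernel):
    """Correlate a BEV map with the four 90-degree rotations of one kernel.

    Returns an array of shape (4, H, W): one response map per orientation.
    Hypothetical helper for illustration only.
    """
    return np.stack(
        [correlate2d_same(bev, np.rot90(kernel, r)) for r in range(4)]
    )


# Toy data: a square single-channel BEV feature map and a 3x3 filter.
rng = np.random.default_rng(0)
bev = rng.normal(size=(8, 8))
kernel = rng.normal(size=(3, 3))

out = c4_lifting_conv(bev, kernel)
out_rot = c4_lifting_conv(np.rot90(bev), kernel)

# Equivariance: rotating the input rotates every response map and shifts
# the orientation axis by one step.
expected = np.roll(np.rot90(out, axes=(1, 2)), 1, axis=0)
is_equivariant = np.allclose(out_rot, expected)

# Max-pooling over the orientation axis gives a map that simply rotates
# with the input.
pooled_ok = np.allclose(out_rot.max(axis=0), np.rot90(out.max(axis=0)))

print(is_equivariant, pooled_ok)
```

An ordinary convolution with an arbitrary kernel does not have this property, which is one way to read the abstract's claim that equivariant feature extraction on the fused BEV map helps orientation prediction. A practical network would use multi-channel learned filters and possibly a finer rotation group; those details are not given here.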
