Paper information - Confidence Guided Stereo 3D Object Detection with Split Depth Estimation

Accurate and reliable 3D object detection is vital to safe autonomous driving. Despite recent developments, the performance gap between stereo-based methods and LiDAR-based methods is still considerable. Accurate depth estimation is crucial to the performance of stereo-based 3D object detection methods, particularly for those pixels associated with objects in the foreground. Moreover, stereo-based methods suffer from high variance in the depth estimation accuracy, which is often not considered in the object detection pipeline. To tackle these two issues, we propose CG-Stereo, a confidence-guided stereo 3D object detection pipeline that uses separate decoders for foreground and background pixels during depth estimation, and leverages the confidence estimation from the depth estimation network as a soft attention mechanism in the 3D object detector. Our approach outperforms all state-of-the-art stereo-based 3D detectors on the KITTI benchmark.
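The two ideas named in the abstract (separate depth outputs for foreground and background pixels, and depth confidence used as a soft attention weight) can be illustrated with a toy sketch. This is not the CG-Stereo implementation: the function names `merge_split_depth` and `apply_confidence`, the per-pixel boolean foreground mask, and the simple multiplicative weighting are all assumptions made here for illustration only.

```python
# Toy sketch only; names, mask format, and weighting scheme are assumptions,
# not taken from the paper.

def merge_split_depth(depth_fg, depth_bg, fg_mask):
    """Take the foreground branch's depth where the mask is True,
    and the background branch's depth elsewhere."""
    return [
        [f if m else b for f, b, m in zip(row_f, row_b, row_m)]
        for row_f, row_b, row_m in zip(depth_fg, depth_bg, fg_mask)
    ]


def apply_confidence(features, confidence):
    """Scale each pixel's feature vector by its depth confidence in [0, 1],
    so low-confidence pixels contribute less downstream."""
    return [
        [[c * v for v in feat] for feat, c in zip(row_feat, row_conf)]
        for row_feat, row_conf in zip(features, confidence)
    ]


# 2x2 toy inputs (depths in metres, made-up values)
depth_fg = [[5.0, 6.0], [7.0, 8.0]]
depth_bg = [[50.0, 60.0], [70.0, 80.0]]
fg_mask = [[True, False], [False, True]]
confidence = [[0.9, 0.2], [0.5, 1.0]]
features = [[[1.0, 2.0], [1.0, 2.0]], [[1.0, 2.0], [1.0, 2.0]]]

depth = merge_split_depth(depth_fg, depth_bg, fg_mask)
weighted = apply_confidence(features, confidence)
print(depth)
print(weighted[1][0])
```

In this sketch the confidence acts as a per-pixel gate rather than a hard threshold, which is one simple way to read "soft attention"; the paper's actual mechanism may differ.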

Steven L. Waslander | Jason Ku | Chengyao Li
