AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection

Existing deep learning-based approaches for monocular 3D object detection in autonomous driving often model the object as a rotated 3D cuboid and ignore the object's geometric shape. In this work, we propose an approach for incorporating shape-aware 2D/3D constraints into the 3D detection framework. Specifically, we first employ a deep neural network to learn distinctive 2D keypoints in the image domain and to regress their corresponding 3D coordinates in the local 3D object coordinate system. The 2D/3D geometric constraints are then built from these correspondences for each object to boost detection performance. To generate the ground truth for the 2D/3D keypoints, we propose an automatic model-fitting approach that fits a deformed 3D object model to the object mask in the 2D image. The proposed framework has been verified on the public KITTI dataset, and the experimental results demonstrate that the additional geometric constraints significantly improve detection performance over the baseline method. More importantly, the proposed framework achieves state-of-the-art performance in real time. Data and code will be available at https://github.com/
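
To make the idea of a 2D/3D keypoint constraint concrete, the following is a minimal sketch and not the authors' implementation. It assumes a pinhole camera with zero skew, a rotation about the camera Y axis only (a KITTI-style yaw convention), and an already known yaw angle. Under those assumptions, each 2D/3D keypoint correspondence gives two equations that are linear in the object translation, so the translation can be recovered by least squares. The function names `yaw_rotation`, `project`, and `solve_translation` are hypothetical and chosen for illustration.

```python
import numpy as np


def yaw_rotation(yaw):
    """Rotation about the camera Y axis (assumed KITTI-style yaw convention)."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])


def project(K, R, t, pts_obj):
    """Project N x 3 object-frame keypoints to N x 2 pixel coordinates."""
    pts_cam = pts_obj @ R.T + t      # object frame -> camera frame
    uvw = pts_cam @ K.T              # apply the pinhole intrinsics
    return uvw[:, :2] / uvw[:, 2:3]  # perspective division


def solve_translation(K, R, pts_obj, kps_2d):
    """Least-squares object translation from 2D/3D keypoint correspondences.

    For a rotated keypoint q = R p, the projection equations
        (u - cx) * (q_z + t_z) = fx * (q_x + t_x)
        (v - cy) * (q_z + t_z) = fy * (q_y + t_y)
    are linear in t = (t_x, t_y, t_z). Assumes zero skew in K.
    """
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    q = pts_obj @ R.T
    du = kps_2d[:, 0] - cx
    dv = kps_2d[:, 1] - cy

    n = q.shape[0]
    A = np.zeros((2 * n, 3))
    b = np.zeros(2 * n)
    # Rows from the u equations: fx * t_x - du * t_z = du * q_z - fx * q_x
    A[0::2, 0] = fx
    A[0::2, 2] = -du
    b[0::2] = du * q[:, 2] - fx * q[:, 0]
    # Rows from the v equations: fy * t_y - dv * t_z = dv * q_z - fy * q_y
    A[1::2, 1] = fy
    A[1::2, 2] = -dv
    b[1::2] = dv * q[:, 2] - fy * q[:, 1]

    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t
```

With more keypoints than the minimum, the system is overdetermined, which is one way in which additional shape-aware keypoints could tighten the estimate when the 2D detections are noisy. A full detector would also have to estimate orientation and object dimensions, which this sketch takes as given.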
