3D-like Bounding Box for Vehicle Detection

In this work, we propose a 3D-like bounding box to detect vehicles in RGB image based on the 2D bounding box detection. Taking advantage of key-points detection by detecting eight corners of the 3D box surrounding the vehicle, our 3D-like bounding box is capable of estimating the vehicle position more accurately than traditional object detection methods, while provides the information of the front and back of the vehicle or even the orientation that is of great help in advanced driver assistance system. We present two improved strategies for data annotation to evaluate the proposed method. The experiments on KITTI object detection database show that our presented two strategies exhibit promising performance.

[1]  Jana Kosecka,et al.  3D Bounding Box Estimation Using Deep Learning and Geometry , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Jian Sun,et al.  Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Jitendra Malik,et al.  Using k-Poselets for Detecting People and Localizing Their Keypoints , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Yuting Zhang,et al.  Improving object detection with deep convolutional networks via Bayesian optimization and structured prediction , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[8]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[9]  Xiaogang Wang,et al.  Multi-source Deep Learning for Human Pose Estimation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Christian Szegedy,et al.  DeepPose: Human Pose Estimation via Deep Neural Networks , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Jonathan Tompson,et al.  Efficient object localization using Convolutional Networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  David A. McAllester,et al.  A discriminatively trained, multiscale, deformable part model , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Xiang Zhang,et al.  OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks , 2013, ICLR.

[15]  Dumitru Erhan,et al.  Scalable Object Detection Using Deep Neural Networks , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Dumitru Erhan,et al.  Deep Neural Networks for Object Detection , 2013, NIPS.

[18]  Martin A. Fischler,et al.  The Representation and Matching of Pictorial Structures , 1973, IEEE Transactions on Computers.

[19]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[20]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[22]  Daniel P. Huttenlocher,et al.  Pictorial Structures for Object Recognition , 2004, International Journal of Computer Vision.

[23]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).