Enhanced Object Detection With Deep Convolutional Neural Networks for Advanced Driving Assistance

Object detection is a critical problem for advanced driving assistance systems (ADAS). Recently, convolutional neural networks (CNN) achieved large successes on object detection, with performance improvement over traditional approaches, which use hand-engineered features. However, due to the challenging driving environment (e.g., large object scale variation, object occlusion, and bad light conditions), popular CNN detectors do not achieve very good object detection accuracy over the KITTI autonomous driving benchmark dataset. In this paper, we propose three enhancements for CNN-based visual object detection for ADAS. To address the large object scale variation challenge, deconvolution and fusion of CNN feature maps are proposed to add context and deeper features for better object detection at low feature map scales. In addition, soft non-maximal suppression (NMS) is applied across object proposals at different feature scales to address the object occlusion challenge. As the cars and pedestrians have distinct aspect ratio features, we measure their aspect ratio statistics and exploit them to set anchor boxes properly for better object matching and localization. The proposed CNN enhancements are evaluated with various image input sizes by experiments over KITTI dataset. The experimental results demonstrate the effectiveness of the proposed enhancements with good detection performance over KITTI test set.

[1]  Yi Li,et al.  R-FCN: Object Detection via Region-based Fully Convolutional Networks , 2016, NIPS.

[2]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[3]  David A. McAllester,et al.  Cascade object detection with deformable part models , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[4]  Jana Kosecka,et al.  3D Bounding Box Estimation Using Deep Learning and Geometry , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[6]  Pietro Perona,et al.  Integral Channel Features , 2009, BMVC.

[7]  Koen E. A. van de Sande,et al.  Selective Search for Object Recognition , 2013, International Journal of Computer Vision.

[8]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[9]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Rogério Schmidt Feris,et al.  A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection , 2016, ECCV.

[12]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[13]  Sanja Fidler,et al.  Monocular 3D Object Detection for Autonomous Driving , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[16]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Fatih Murat Porikli,et al.  Fast Detection of Multiple Objects in Traffic Scenes With a Common Detection Framework , 2015, IEEE Transactions on Intelligent Transportation Systems.

[18]  Wei Liu,et al.  DSSD : Deconvolutional Single Shot Detector , 2017, ArXiv.

[19]  Yu-Wing Tai,et al.  Accurate Single Stage Detector Using Recurrent Rolling Convolution , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Larry S. Davis,et al.  Soft-NMS — Improving Object Detection with One Line of Code , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[21]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[22]  Thierry Chateau,et al.  Deep MANTA: A Coarse-to-Fine Many-Task Network for Joint 2D and 3D Vehicle Analysis from Monocular Image , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[24]  Silvio Savarese,et al.  Subcategory-Aware Convolutional Neural Networks for Object Proposals and Detection , 2016, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).

[25]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[26]  Ji Wan,et al.  Multi-view 3D Object Detection Network for Autonomous Driving , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  C. Lawrence Zitnick,et al.  Edge Boxes: Locating Object Proposals from Edges , 2014, ECCV.

[28]  Fan Yang,et al.  Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Mohan M. Trivedi,et al.  Looking at Pedestrians at Different Scales: A Multiresolution Approach and Evaluations , 2016, IEEE Transactions on Intelligent Transportation Systems.

[30]  Ali Farhadi,et al.  YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Shuai Su,et al.  A Graph-Based Vehicle Proposal Location and Detection Algorithm , 2017, IEEE Transactions on Intelligent Transportation Systems.

[32]  Nojun Kwak,et al.  Enhancement of SSD by concatenating feature maps for object detection , 2017, BMVC.

[33]  Pietro Perona,et al.  Fast Feature Pyramids for Object Detection , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[35]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.