Paper information - An Efficient Hierarchical Convolutional Neural Network for Traffic Object Detection

In this paper, we propose a novel hierarchical convolutional neural network for traffic object detection, named Fusion and Multi-level Alignment CNN (FMLA-CNN). The method extends a popular two-stage detector by incorporating a modified feature fusion module and a multi-level alignment (MLA) strategy, so that it can efficiently detect multi-scale objects in autonomous driving scenarios. The feature fusion strategy in the proposal generation network improves detection accuracy by injecting high-level semantics into the whole pyramidal feature hierarchy. Subsequently, the MLA strategy in the second detection stage precisely preserves spatial locations from the corresponding feature layers determined by the hierarchical region-of-interest proposals. In experiments on the KITTI benchmark, FMLA-CNN achieves a better trade-off between accuracy and efficiency than other state-of-the-art methods.
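The abstract does not say how a region-of-interest proposal is matched to a feature layer, so the following is only a minimal sketch of one common choice: the scale-based level assignment used by Feature Pyramid Networks, followed by an unrounded projection of the box onto that level. The function names, the level range 2 to 5, and the canonical size of 224 are assumptions for illustration and are not taken from FMLA-CNN.

```python
import math


def assign_pyramid_level(width, height, k0=4, canonical_size=224, k_min=2, k_max=5):
    """Pick a pyramid level for an RoI from its scale (FPN-style heuristic).

    Assumption: levels k_min..k_max exist and level k has stride 2**k.
    """
    if width <= 0 or height <= 0:
        raise ValueError("RoI width and height must be positive")
    scale = math.sqrt(width * height)
    level = math.floor(k0 + math.log2(scale / canonical_size))
    # Clamp to the range of levels that the (hypothetical) pyramid provides.
    return max(k_min, min(k_max, level))


def roi_to_feature_coords(box, level):
    """Project an image-space box (x1, y1, x2, y2) onto a pyramid level.

    Coordinates are divided by the stride and left as floats, so no
    spatial information is lost to rounding before pooling.
    """
    stride = 2 ** level
    return tuple(coord / stride for coord in box)


if __name__ == "__main__":
    # Hypothetical proposals: a small, a medium, and a large object.
    rois = [(100, 120, 132, 160), (50, 60, 274, 284), (0, 0, 900, 370)]
    for box in rois:
        w, h = box[2] - box[0], box[3] - box[1]
        lvl = assign_pyramid_level(w, h)
        print(box, "-> level", lvl, roi_to_feature_coords(box, lvl))
```

Small proposals land on finer (lower-stride) levels and large ones on coarser levels, which is the general idea behind selecting feature layers per proposal; the actual MLA rule may differ.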

Ming Yang | Chunxiang Wang | Bing Wang | Qianqian Bi
