New Default Box Strategy of SSD for Small Target Detection

SSD, which combines the advantages of Faster R-CNN and YOLO, performs well in both detection speed and precision by merging the default boxes of six different layers. Because the original default box strategy cannot accurately capture small-target information, the detection precision of SSD on small targets is lower than on normal-size targets. In this paper, a new default box strategy, which gives an appropriate size and number of default boxes, is proposed to improve the performance of SSD for small target detection. The new strategy consists of new scales and new aspect ratios. The new scales, which provide the basic scales for the six layers, are defined by the size ratio of the kernel to the convolutional layer. In addition, the scale range is reduced from [20, 90] to [20, 60]. The new aspect ratios, which determine the size and number of default boxes in the six layers, are defined as [[1.1], [1.1], [1.1], [1.1], [0.8, 1.2], [1.1]]. Experimental results on the small ground target dataset show that the detection precision of SSD with the new strategy is 99.5 mAP, which is 4.6 mAP higher than that of the original SSD. More importantly, the training time of SSD with the new strategy is 963 s or 326 s less than that of the original SSD.
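
To make the effect of the narrower scale range concrete, the sketch below compares six per-layer scales over [20, 90] and [20, 60]. It assumes the evenly spaced scale rule and the width/height convention (w = s·sqrt(ar), h = s/sqrt(ar)) of the original SSD formulation; it does not reproduce the kernel-to-layer size ratio that defines the new scales, which the abstract does not spell out. The function names `linear_scales` and `box_shapes` are hypothetical, chosen for illustration only.

```python
import math


def linear_scales(s_min, s_max, num_layers):
    """Evenly spaced scales (percent of input size), one per layer.

    Assumption: this is the original SSD spacing rule, not the
    kernel-based definition proposed in the paper.
    """
    step = (s_max - s_min) / (num_layers - 1)
    return [s_min + step * k for k in range(num_layers)]


def box_shapes(scale, aspect_ratios):
    """Return (width, height) of a default box for each aspect ratio."""
    return [(scale * math.sqrt(ar), scale / math.sqrt(ar))
            for ar in aspect_ratios]


original = linear_scales(20, 90, 6)  # original range
reduced = linear_scales(20, 60, 6)   # reduced range from the abstract

# Aspect ratios for the six layers, as listed in the abstract.
new_aspect_ratios = [[1.1], [1.1], [1.1], [1.1], [0.8, 1.2], [1.1]]

for layer, (scale, ratios) in enumerate(zip(reduced, new_aspect_ratios), 1):
    shapes = [(round(w, 2), round(h, 2)) for w, h in box_shapes(scale, ratios)]
    print(f"layer {layer}: scale={scale:.0f} shapes={shapes}")
```

Under this assumed spacing, the largest scale drops from 90 to 60 and the gap between adjacent layers shrinks from 14 to 8, so more of the six layers cover small box sizes.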
