Improvement of Real Time Detection Algorithm Based on SSD

Since convolutional neural networks emerged, models have grown steadily larger. This growth has improved accuracy, but the heavy computational load and large memory footprint make such models difficult to deploy on embedded systems. This paper proposes an improved scheme based on the Single Shot Detector (SSD) network model. A Wide Residual Network (WRN), which has a small number of parameters, replaces the original feature extraction network. In addition, the input size of the network is reduced to lower the computational load. To compensate for the accuracy lost by reducing the input size, and to address the imbalance between positive and negative training samples, Focal Loss is adopted in the training objective, which focuses training on hard samples. Experiments show that the model achieves an mAP of 0.781 on VOC0712 while running at 89 FPS on a K80 GPU.
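As a rough illustration of why Focal Loss shifts training toward hard samples, the following minimal sketch computes the binary focal loss for a single prediction and compares it with plain cross-entropy. It is not the paper's training code: the function names are hypothetical, and the defaults `gamma=2.0` and `alpha=0.25` are the values commonly used in the Focal Loss literature, assumed here because the abstract does not state the settings used.

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy for one prediction.

    p: predicted probability of the positive class, y: label (1 or 0).
    """
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    p_t = min(max(p_t, eps), 1.0)           # clamp to avoid log(0)
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    gamma and alpha are assumed defaults, not values taken from the paper.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (confidently correct) versus a hard positive (mostly wrong).
easy_fl, hard_fl = focal_loss(0.95, 1), focal_loss(0.10, 1)
easy_ce, hard_ce = cross_entropy(0.95, 1), cross_entropy(0.10, 1)

# The hard/easy ratio is far larger under focal loss than under cross-entropy,
# so easy samples contribute much less to the total loss.
print("cross-entropy hard/easy ratio:", hard_ce / easy_ce)
print("focal loss    hard/easy ratio:", hard_fl / easy_fl)
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the cross-entropy, which is a convenient sanity check when experimenting with other settings.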
