Spatial Attention Network for Head Detection

Human head detection is widely used in computer vision. In practical applications, however, it is prone to false alarms caused by variations in viewing angle, lighting conditions, and differences between cameras. This paper proposes a novel spatial attention network (SAN) that adopts a saliency module to exploit environmental information beyond the proposal region, which Faster R-CNN ignores. At the same time, the class score and the saliency score are fused through a suitable strategy to effectively suppress false positive samples. To train and test our model, we have established a dataset of 55,802 images. Experimental results show that our model significantly outperforms the Faster R-CNN model.
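The abstract does not specify how the class score and saliency score are fused, so the following is only an illustrative sketch of one possible strategy, not the paper's method. It assumes both scores lie in [0, 1] and combines them with a weighted geometric mean, so that a proposal with a high class score but a low saliency score is suppressed. The names `fuse_scores`, `filter_detections`, `alpha`, and `threshold` are hypothetical and chosen for illustration.

```python
def fuse_scores(class_score, saliency_score, alpha=0.5):
    """Weighted geometric mean of two scores in [0, 1].

    ASSUMPTION: this fusion rule is illustrative only; the source
    does not state which fusion strategy the SAN actually uses.
    alpha weights the class score, (1 - alpha) the saliency score.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    for score in (class_score, saliency_score):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must be in [0, 1]")
    return (class_score ** alpha) * (saliency_score ** (1.0 - alpha))


def filter_detections(detections, threshold=0.5, alpha=0.5):
    """Keep detections whose fused score reaches the threshold.

    detections: iterable of (box, class_score, saliency_score) tuples.
    Returns a list of (box, fused_score) tuples.
    """
    kept = []
    for box, class_score, saliency_score in detections:
        fused = fuse_scores(class_score, saliency_score, alpha)
        if fused >= threshold:
            kept.append((box, fused))
    return kept


if __name__ == "__main__":
    # Hypothetical proposals: (box, class score, saliency score).
    proposals = [
        ((0, 0, 10, 10), 0.9, 0.9),   # confident and salient
        ((5, 5, 20, 20), 0.9, 0.05),  # confident but not salient
    ]
    for box, score in filter_detections(proposals):
        print(box, round(score, 3))
```

With a multiplicative rule of this kind, either score being close to zero pulls the fused score down, which is one way a saliency signal could veto a false positive that the classifier alone would accept.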
