论文信息 - Repulsion Loss: Detecting Pedestrians in a Crowd

Repulsion Loss: Detecting Pedestrians in a Crowd

Detecting individual pedestrians in a crowd remains a challenging problem since the pedestrians often gather together and occlude each other in real-world scenarios. In this paper, we first explore how a state-of-the-art pedestrian detector is harmed by crowd occlusion via experimentation, providing insights into the crowd occlusion problem. Then, we propose a novel bounding box regression loss specifically designed for crowd scenes, termed repulsion loss. This loss is driven by two motivations: the attraction by target, and the repulsion by other surrounding objects. The repulsion term prevents the proposal from shifting to surrounding objects thus leading to more crowd-robust localization. Our detector trained by repulsion loss outperforms the state-of-the-art methods with a significant improvement in occlusion cases.

[1] B. Schiele,et al. How Far are We from Solving Pedestrian Detection? , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2] Yuning Jiang,et al. What Can Help Pedestrian Detection? , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3] Bernt Schiele,et al. Taking a deeper look at pedestrians , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4] Pietro Perona,et al. Fast Feature Pyramids for Object Detection , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5] Rogério Schmidt Feris,et al. A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection , 2016, ECCV.

[6] Luc Van Gool,et al. The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[7] Chunluan Zhou,et al. Multi-label Learning of Part Detectors for Heavily Occluded Pedestrian Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[8] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[9] Pietro Perona,et al. Integral Channel Features , 2009, BMVC.

[10] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11] Yi Li,et al. Fully Convolutional Instance-Aware Semantic Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12] Yi Yang,et al. DenseBox: Unifying Landmark Localization with End to End Object Detection , 2015, ArXiv.

[13] Xiaogang Wang,et al. A discriminative deep model for pedestrian detection with occlusion handling , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[14] Bernt Schiele,et al. Filtered channel features for pedestrian detection , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16] Kaiming He,et al. Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[17] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[18] Abhinav Gupta,et al. Training Region-Based Object Detectors with Online Hard Example Mining , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19] Charless C. Fowlkes,et al. Discriminative Models for Multi-Class Object Layout , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[20] Sebastian Ramos,et al. The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21] Kaiming He,et al. Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22] Xiaogang Wang,et al. Deep Learning Strong Parts for Pedestrian Detection , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[23] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[24] Yuning Jiang,et al. UnitBox: An Advanced Object Detection Network , 2016, ACM Multimedia.

[25] Bernt Schiele,et al. Learning Non-maximum Suppression , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26] Kaiming He,et al. Mask R-CNN , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[27] Shuicheng Yan,et al. Scale-Aware Fast R-CNN for Pedestrian Detection , 2015, IEEE Transactions on Multimedia.

[28] Pietro Perona,et al. Pedestrian detection: A benchmark , 2009, CVPR.

[29] Bernt Schiele,et al. CityPersons: A Diverse Dataset for Pedestrian Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30] Yi Li,et al. R-FCN: Object Detection via Region-based Fully Convolutional Networks , 2016, NIPS.

[31] Ross B. Girshick,et al. Fast R-CNN , 2015, 1504.08083.

[32] Joon Hee Han,et al. Local Decorrelation For Improved Detection , 2014, ArXiv.

[33] Bin Yang,et al. Convolutional Channel Features , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[34] Liang Lin,et al. Is Faster R-CNN Doing Well for Pedestrian Detection? , 2016, ECCV.