Attribute-aware Pedestrian Detection in a Crowd

Pedestrian detection is an initial step to perform outdoor scene analysis, which plays an essential role in many real-world applications. Although having enjoyed the merits of deep learning frameworks from the generic object detectors, pedestrian detection is still a very challenging task due to heavy occlusion and highly crowded group. Generally, the conventional detectors are unable to differentiate individuals from each other effectively under such a dense environment. To tackle this critical problem, we propose an attribute-aware pedestrian detector to explicitly model people's semantic attributes in a high-level feature detection fashion. Besides the typical semantic features, center position, target's scale and offset, we introduce a pedestrian-oriented attribute feature to encode the high-level semantic differences among the crowd. Moreover, a novel attribute-feature-based Non-Maximum Suppression~(NMS) is proposed to distinguish the person from a highly overlapped group by adaptively rejecting the false-positive results in a very crowd settings. Furthermore, a novel ground truth target is designed to alleviate the difficulties caused by the attribute configuration and extremely class imbalance issues during training. Finally, we evaluate our proposed attribute-aware pedestrian detector on two benchmark datasets including CityPersons and CrowdHuman. The experimental results show that our approach outperforms state-of-the-art methods at a large margin on pedestrian detection.

[1]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Haroon Idrees,et al.  Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds , 2018, ECCV.

[3]  Chunluan Zhou,et al.  Bi-box Regression for Pedestrian Detection and Occlusion Estimation , 2018, ECCV.

[4]  Rogério Schmidt Feris,et al.  A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection , 2016, ECCV.

[5]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Tomaso A. Poggio,et al.  A Trainable System for Object Detection , 2000, International Journal of Computer Vision.

[7]  Shifeng Zhang,et al.  Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd , 2018, ECCV.

[8]  Xiaogang Wang,et al.  Pedestrian detection aided by deep learning semantic tasks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Kaiming He,et al.  Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[10]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[11]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[12]  Xiaogang Wang,et al.  Deep Learning Strong Parts for Pedestrian Detection , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[13]  Ali Farhadi,et al.  YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Trevor Darrell,et al.  Deep Layer Aggregation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[15]  Yuning Jiang,et al.  What Can Help Pedestrian Detection? , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Wei Liu,et al.  High-Level Semantic Feature Detection: A New Perspective for Pedestrian Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Yichen Wei,et al.  Relation Networks for Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[18]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[19]  Hui Zhou,et al.  Pedestrian Detection via Body Part Semantic and Contextual Information With DNN , 2018, IEEE Transactions on Multimedia.

[20]  Wei Liu,et al.  Learning Efficient Single-Stage Pedestrian Detectors by Asymptotic Localization Fitting , 2018, ECCV.

[21]  Yunhong Wang,et al.  Adaptive NMS: Refining Pedestrian Detection in a Crowd , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Pietro Perona,et al.  Fast Feature Pyramids for Object Detection , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Yuning Jiang,et al.  Acquisition of Localization Confidence for Accurate Object Detection , 2018, ECCV.

[25]  Xiangyu Zhang,et al.  CrowdHuman: A Benchmark for Detecting Human in a Crowd , 2018, ArXiv.

[26]  Yuning Jiang,et al.  Repulsion Loss: Detecting Pedestrians in a Crowd , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[27]  Xingyi Zhou,et al.  Objects as Points , 2019, ArXiv.

[28]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[29]  Xiaogang Wang,et al.  A discriminative deep model for pedestrian detection with occlusion handling , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Bernt Schiele,et al.  CityPersons: A Diverse Dataset for Pedestrian Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Harri Valpola,et al.  Weight-averaged consistency targets improve semi-supervised deep learning results , 2017, ArXiv.

[32]  Bernt Schiele,et al.  Learning Non-maximum Suppression , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Lars Petersson,et al.  Improving Object Localization with Fitness NMS and Bounded IoU Loss , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34]  Liang Lin,et al.  Is Faster R-CNN Doing Well for Pedestrian Detection? , 2016, ECCV.

[35]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Donghoon Lee,et al.  Individualness and Determinantal Point Processes for Pedestrian Detection , 2016, ECCV.

[37]  Shuicheng Yan,et al.  Scale-Aware Fast R-CNN for Pedestrian Detection , 2015, IEEE Transactions on Multimedia.

[38]  Joon Hee Han,et al.  Local Decorrelation For Improved Pedestrian Detection , 2014, NIPS.

[39]  Hei Law,et al.  CornerNet: Detecting Objects as Paired Keypoints , 2018, ECCV.

[40]  Shiliang Pu,et al.  Small-Scale Pedestrian Detection Based on Topological Line Localization and Temporal Feature Aggregation , 2018, ECCV.

[41]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[42]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Robert T. Collins,et al.  Optimized Pedestrian Detection for Multiple and Occluded People , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[44]  Larry S. Davis,et al.  Soft-NMS — Improving Object Detection with One Line of Code , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[45]  Luc Van Gool,et al.  Handling Occlusions with Franken-Classifiers , 2013, 2013 IEEE International Conference on Computer Vision.

[46]  Bernt Schiele,et al.  Filtered channel features for pedestrian detection , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  Jian Dong,et al.  Attentive Contexts for Object Detection , 2016, IEEE Transactions on Multimedia.

[48]  Chunluan Zhou,et al.  Multi-label Learning of Part Detectors for Heavily Occluded Pedestrian Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[49]  Hieu Le,et al.  Iterative Crowd Counting , 2018, ECCV.