Paper Information - DVRNet: Decoupled Visible Region Network for Pedestrian Detection

Pedestrian detection remains challenging because of the wide variation in occlusion. Visible-body bounding boxes are typically used as an extra supervision signal to improve full-body prediction in pedestrian detection. However, visible-body assisted approaches produce a large number of false positives, which result from a lack of adequate and discriminative full-body contextual information. In this paper, we propose a new network, dubbed DVRNet, based on Bi-box, a representative visible-body assisted pedestrian detector. Specifically, we extend Bi-box with three modules: the attention-based feature interleaver module (AFIM), the binary mask learning module (BMLM), and the head-aware feature enhancement module (HFEM). These modules use features learned from the visible-body and head supervision signals to enrich the highly discriminative contextual information of the full body and to strengthen the feature representation. Experimental results indicate that DVRNet achieves promising results on the CityPersons and CrowdHuman datasets.

Lei Shi | Ioannis A. Kakadiaris | Charles Livermore
