Self-Enhanced R-CNNs for Human Detection With Semi-Supervised Assumptions

Vision-based human detection is a fundamental task in visual content analysis, with a wide range of applications, especially in person search and retrieval. To reduce the reliance of detection models on large amounts of labeled data, we modify Faster R-CNN to facilitate semi-supervised human detection. Specifically, a Reliability Analysis (RA) module is included as an add-on in our Self-Enhanced R-CNN (SE-RCNN) model. With the help of this module, unlabeled images can be pseudo-annotated reliably, so that both labeled and unlabeled data are fed simultaneously for model optimization. The additional supervision, in turn, guides the training of the detection module in our model. The two aspects, extracting precise proposals and generating reliable pseudo annotations, thus reinforce each other. Unlike previous related work, this is the first attempt to build a single-stage semi-supervised human detection model. In our experiments, we observe that the RA module plays an important role in exploiting unlabeled data and enables SE-RCNN to achieve state-of-the-art results on multiple benchmarks.
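The pseudo-annotation idea described above can be illustrated with a minimal sketch. It is not the SE-RCNN implementation: the abstract does not specify how the RA module scores detections, so the `reliability` field, the two thresholds, and the names `Detection`, `pseudo_annotate`, and `build_mixed_batch` are all hypothetical. The sketch only shows the general pattern of keeping reliable detections on unlabeled images as pseudo annotations and mixing them with labeled data in one training batch.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Detection:
    box: Box
    score: float        # detector confidence for the "person" class
    reliability: float  # hypothetical RA-style reliability score in [0, 1]


def pseudo_annotate(detections: List[Detection],
                    min_score: float = 0.5,
                    min_reliability: float = 0.8) -> List[Box]:
    """Keep only detections judged reliable enough to act as pseudo labels.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    return [d.box for d in detections
            if d.score >= min_score and d.reliability >= min_reliability]


def build_mixed_batch(labeled: List[Tuple[str, List[Box]]],
                      unlabeled: Dict[str, List[Detection]]):
    """Combine ground-truth and pseudo-annotated images into one batch.

    Each entry is (image_id, boxes, source) so a training loop could,
    for example, weight the two kinds of supervision differently.
    """
    batch = [(image_id, boxes, "labeled") for image_id, boxes in labeled]
    for image_id, detections in unlabeled.items():
        boxes = pseudo_annotate(detections)
        if boxes:  # skip images with no reliable detection
            batch.append((image_id, boxes, "pseudo"))
    return batch
```

In a self-enhancing loop of this kind, the detector would be retrained on the mixed batch and then used to produce new detections on the unlabeled images, so that better proposals and better pseudo annotations improve together.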
