Occlusion Problem-Oriented Adversarial Faster-RCNN Scheme

In practical scenes, object detection faces complicated conditions. Occlusion occurs frequently in real scenes and reduces detection accuracy, especially for the occluded objects themselves. For deep models, a larger dataset with sufficient occluded samples would improve detection performance. However, occluded samples are hard to obtain. Therefore, a global average pooling (GAP) based adversarial Faster-RCNN is proposed to generate hard samples and enhance the performance of the object detection algorithm. With this model, sufficient hard samples can be generated, so the detector can be trained adequately on occluded objects. Hard samples are generated in image feature space rather than by synthesizing images directly. The class-dependent part of the feature map is located by the GAP network and then obscured to produce the feature map of a hard sample for reinforcement training of the model. Consequently, a better object detection model can be trained using a conventional dataset. Faster-RCNN is adopted as the baseline, and Faster-RCNN and the GAP network are trained jointly to improve the performance of the proposed model. Simulation results validate the proposed algorithm.
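The feature-space occlusion step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the class-dependent region is found with a class activation map (a weighted sum of feature channels using the classifier weights that follow the GAP layer), and it assumes that a fixed fraction of the most activated spatial cells is zeroed. The function names and the `drop_fraction` parameter are hypothetical and chosen for illustration only.

```python
import numpy as np

def class_activation_map(features, class_weights):
    """Weighted sum of feature channels for one class.

    features:      array of shape (C, H, W), a feature map for one region.
    class_weights: array of shape (C,), the weights linking the GAP output
                   to the score of the class of interest (assumed setup).
    Returns an (H, W) map; larger values mark more class-dependent cells.
    """
    return np.tensordot(class_weights, features, axes=([0], [0]))

def occlude_discriminative_region(features, class_weights, drop_fraction=0.25):
    """Zero out the most class-dependent cells to imitate an occlusion.

    drop_fraction is a hypothetical knob: the share of spatial cells to mask.
    Returns the masked feature map and the (H, W) binary mask that was applied.
    """
    cam = class_activation_map(features, class_weights)
    k = max(1, int(round(drop_fraction * cam.size)))
    # Indices of the k largest activations in the flattened map.
    top_idx = np.argpartition(cam.ravel(), -k)[-k:]
    mask = np.ones(cam.size, dtype=features.dtype)
    mask[top_idx] = 0
    mask = mask.reshape(cam.shape)
    # Broadcast the spatial mask over all channels.
    return features * mask[None, :, :], mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.random((8, 4, 4))   # toy feature map: 8 channels, 4x4 grid
    weights = rng.random(8)         # toy class weights
    hard_feats, mask = occlude_discriminative_region(feats, weights)
    print(mask)
```

In a training loop, the masked feature map would be passed to the detection head in place of the original so that the detector sees a harder, partially obscured version of the same region. How the mask is chosen and how the two networks are optimized jointly are details the abstract does not specify.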
