Adaptive pedestrian detection by modulating features in dynamical environment

The accuracy of a trained pedestrian detector decreases in a new scenario if the sample distributions of the training and testing scenarios differ. Traditional methods address this problem with domain adaptation techniques. Unfortunately, most existing methods need to keep the source samples or to label target samples in the detection phase, so they are hard to apply in real applications with dynamical environments. To address this problem, we propose a feature modulation model, which consists of a Simple Dynamical Neural Network (SDNN) and a Modulating Neural Network (MNN). In the SDNN, a dynamical layer is adopted to adaptively weight the feature maps, and the parameters of this layer are predicted by the MNN. For each candidate proposal, the SDNN thus generates a deep feature specific to that proposal. Our contributions include 1) the first feature-based unsupervised domain adaptation method, which is well suited to real applications, and 2) a new scheme for dynamically weighting feature maps, together with the corresponding training method. Experimental results confirm that our method achieves competitive results on two pedestrian datasets.
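The modulation idea can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the architecture of the MNN or the form of the dynamical layer, so the sketch assumes that the MNN is a single linear layer with a sigmoid acting on a globally average-pooled descriptor of the proposal, and that the dynamical layer scales each feature map by one predicted weight. The names `global_average_pool`, `modulating_network`, `dynamical_layer`, and `modulate_proposal` are hypothetical.

```python
import math
import random


def global_average_pool(feature_maps):
    """Reduce C feature maps of size H x W to one scalar per channel."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


def modulating_network(descriptor, weights, biases):
    """Hypothetical stand-in for the MNN (an assumption, not the paper's design):
    one linear layer plus a sigmoid, mapping a per-proposal descriptor
    to one weight in (0, 1) per feature map."""
    channel_weights = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, descriptor)) + bias
        channel_weights.append(1.0 / (1.0 + math.exp(-z)))
    return channel_weights


def dynamical_layer(feature_maps, channel_weights):
    """Assumed form of the dynamical layer: scale every value of
    channel c by the weight predicted for channel c."""
    return [
        [[channel_weights[c] * value for value in row] for row in fmap]
        for c, fmap in enumerate(feature_maps)
    ]


def modulate_proposal(feature_maps, weights, biases):
    """Produce a proposal-specific feature by predicting and applying weights."""
    descriptor = global_average_pool(feature_maps)
    channel_weights = modulating_network(descriptor, weights, biases)
    return dynamical_layer(feature_maps, channel_weights), channel_weights


if __name__ == "__main__":
    random.seed(0)
    channels, height, width = 3, 2, 2
    # Random values standing in for the feature maps of one candidate proposal.
    proposal_maps = [
        [[random.random() for _ in range(width)] for _ in range(height)]
        for _ in range(channels)
    ]
    # Random (untrained) MNN parameters, for illustration only.
    mnn_weights = [
        [random.uniform(-1.0, 1.0) for _ in range(channels)]
        for _ in range(channels)
    ]
    mnn_biases = [0.0] * channels
    modulated, predicted = modulate_proposal(proposal_maps, mnn_weights, mnn_biases)
    print("predicted channel weights:", predicted)
```

Because the weights are recomputed from each proposal's own features, two proposals passing through the same network receive different channel weightings, and neither source samples nor target labels are needed at detection time. In this sketch the MNN parameters are random; in the proposed method they would be learned with the training procedure described in the paper.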
