Unsupervised Deep Domain Adaptation for Pedestrian Detection

This paper addresses the problem of unsupervised domain adaptation on the task of pedestrian detection in crowded scenes. First, we utilize an iterative algorithm to iteratively select and auto-annotate positive pedestrian samples with high confidence as the training samples for the target domain. Meanwhile, we also reuse negative samples from the source domain to compensate for the imbalance between the amount of positive samples and negative samples. Second, based on the deep network we also design an unsupervised regularizer to mitigate influence from data noise. More specifically, we transform the last fully connected layer into two sub-layers — an element-wise multiply layer and a sum layer, and add the unsupervised regularizer to further improve the domain adaptation accuracy. In experiments for pedestrian detection, the proposed method boosts the recall value by nearly \(30\,\%\) while the precision stays almost the same. Furthermore, we perform our method on standard domain adaptation benchmarks on both supervised and unsupervised settings and also achieve state-of-the-art results.

[1]  Sumit Chopra,et al.  DLID: Deep Learning for Domain Adaptation by Interpolating between Domains , 2013 .

[2]  Luc Van Gool,et al.  The Pascal Visual Object Classes Challenge: A Retrospective , 2014, International Journal of Computer Vision.

[3]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[4]  Tinne Tuytelaars,et al.  Unsupervised Visual Domain Adaptation Using Subspace Alignment , 2013, 2013 IEEE International Conference on Computer Vision.

[5]  Bernhard Schölkopf,et al.  A Kernel Method for the Two-Sample-Problem , 2006, NIPS.

[6]  Trevor Darrell,et al.  What you saw is not what you get: Domain adaptation using asymmetric kernel transforms , 2011, CVPR 2011.

[7]  Karsten M. Borgwardt,et al.  Covariate Shift by Kernel Mean Matching , 2009, NIPS 2009.

[8]  Mengjie Zhang,et al.  Domain Adaptive Neural Networks for Object Recognition , 2014, PRICAI.

[9]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[10]  Rama Chellappa,et al.  Domain adaptation for object recognition: An unsupervised approach , 2011, 2011 International Conference on Computer Vision.

[11]  Meng Wang,et al.  Deep Learning of Scene-Specific Classifier for Pedestrian Detection , 2014, ECCV.

[12]  Takeo Kanade,et al.  Learning scene-specific pedestrian detectors without real data , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Meng Wang,et al.  Scene-Specific Pedestrian Detection for Static Video Surveillance , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Andrew Y. Ng,et al.  End-to-End People Detection in Crowded Scenes , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Yuan Shi,et al.  Geodesic flow kernel for unsupervised domain adaptation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Barbara Caputo,et al.  Frustratingly Easy NBNN Domain Adaptation , 2013, 2013 IEEE International Conference on Computer Vision.

[17]  Trevor Darrell,et al.  Adapting Visual Category Models to New Domains , 2010, ECCV.

[18]  Kristen Grauman,et al.  Connecting the Dots with Landmarks: Discriminatively Learning Domain-Invariant Features for Unsupervised Domain Adaptation , 2013, ICML.

[19]  Yoshua Bengio,et al.  Unsupervised and Transfer Learning Challenge: a Deep Learning Approach , 2011, ICML Unsupervised and Transfer Learning.

[20]  Bernt Schiele,et al.  Learning people detection models from few training samples , 2011, CVPR 2011.

[21]  Bernhard Schölkopf,et al.  Correcting Sample Selection Bias by Unlabeled Data , 2006, NIPS.

[22]  Trevor Darrell,et al.  Deep Domain Confusion: Maximizing for Domain Invariance , 2014, CVPR 2014.