Scene-Specific Pedestrian Detection for Static Video Surveillance

The performance of a generic pedestrian detector may drop significantly when it is applied to a specific scene, because of the mismatch between the source training set and samples from the target scene. We propose a new approach for automatically transferring a generic pedestrian detector to a scene-specific detector in static video surveillance, without manually labeling samples from the target scene. The proposed transfer learning framework consists of four steps. 1) By exploring the indegrees from target samples to source samples on a visual affinity graph, the source samples are weighted to match the distribution of target samples. 2) A set of context cues is explored to automatically select samples from the target scene, predict their labels, and compute confidence scores to guide transfer learning. 3) The confidence scores are propagated among target samples according to their underlying visual structures. 4) Target samples with higher confidence scores have larger influence on training the scene-specific detector. All these considerations are formulated under a single objective function called confidence-encoded SVM, which avoids hard thresholding on confidence scores. At test time, only the appearance-based detector is used, without context cues. The effectiveness of the approach is demonstrated through experiments on two video surveillance data sets. Compared with a generic detector, it improves the detection rates by 48 and 36 percent at one false positive per image (FPPI) on the two data sets, respectively. The training process converges after one or two iterations on the data sets used in the experiments.
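
Step 1 can be illustrated with a minimal sketch. It is one plausible reading of indegree-based weighting, not the authors' implementation. It assumes each target sample points to its k nearest source samples under Euclidean distance, and that a source sample's weight is its share of incoming edges. The function name `indegree_weights`, the parameter `k`, and the toy 2-D features are hypothetical.

```python
import math
from collections import Counter

def indegree_weights(source, target, k=1):
    """Weight each source sample by its indegree on a target-to-source k-NN graph.

    Assumption: every target sample draws a directed edge to each of its k
    nearest source samples (Euclidean distance), and a source sample's weight
    is its fraction of all such edges.
    """
    indegree = Counter()
    for t in target:
        # Rank source indices by distance to this target sample.
        ranked = sorted(range(len(source)), key=lambda i: math.dist(t, source[i]))
        for i in ranked[:k]:
            indegree[i] += 1
    total = sum(indegree.values())
    if total == 0:
        # No target samples (or k == 0): no evidence, so all weights are zero.
        return [0.0] * len(source)
    return [indegree[i] / total for i in range(len(source))]

# Toy features: two source samples lie near the target samples, one is far away.
source = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
target = [(0.1, 0.0), (0.9, 0.1), (0.4, 0.0)]
weights = indegree_weights(source, target, k=1)
```

In this toy setting the distant source sample receives no incoming edges and therefore zero weight, which is the intended effect: source samples that resemble the target scene dominate retraining, while mismatched ones are suppressed.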
