A corpus for benchmarking of people detection algorithms

This paper describes a corpus (a dataset and its associated ground truth) for evaluating people detection algorithms in surveillance video scenarios, along with the design procedure followed to generate it. Sequences from scenes with different levels of complexity have been manually annotated. Each person present in a scene has been labeled frame by frame, so that a people detection ground truth can be obtained automatically for each sequence. Sequences have been classified into complexity categories according to critical factors that typically affect the behavior of detection algorithms. The resulting corpus, which exceeds other public pedestrian datasets in both the number of video sequences and the variability of their complexity, is freely available for benchmarking and research purposes under a license agreement.
