Predicting Pedestrian Counts in Crowded Scenes With Rich and High-Dimensional Features

Estimating the number of pedestrians in surveillance images and videos has important applications in intelligent transportation systems. The problem is particularly challenging in densely crowded scenes, where techniques that track individual pedestrians have limited effectiveness. Alternative approaches employ statistical learning algorithms to infer pedestrian counts directly from visual features computed on images or video frames. In this paper, we describe a system for predicting pedestrian counts that significantly extends the utility of those ideas. Our approach incorporates a richer set of features for statistical modeling. Although these features give rise to regression problems in a high-dimensional space, we leverage learning techniques that reduce dimensionality while still attaining high accuracy in predicting the number of pedestrians. Empirical results validate our strategy: our system outperforms state-of-the-art methods on standard benchmark tasks by a large margin.
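The general pattern described above (compute a high-dimensional feature vector per frame, reduce its dimensionality, then regress the count) can be illustrated with a minimal sketch. This is not the system described in the paper: PCA and ridge regression are stand-ins chosen for brevity, the data are synthetic, and the names `fit_reduced_regressor` and `predict_counts` are hypothetical.

```python
import numpy as np

def fit_reduced_regressor(X, y, n_components=5, ridge=1e-3):
    """Project features onto the top principal directions, then fit ridge regression.

    PCA and ridge are illustrative stand-ins, not the paper's method.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components].T                      # (d, k) projection matrix
    Z = Xc @ W                                   # (n, k) reduced features
    y_mean = y.mean()
    A = Z.T @ Z + ridge * np.eye(n_components)   # regularized normal equations
    beta = np.linalg.solve(A, Z.T @ (y - y_mean))
    return mean, W, beta, y_mean

def predict_counts(model, X):
    """Apply the stored projection and regression weights to new feature vectors."""
    mean, W, beta, y_mean = model
    return (X - mean) @ W @ beta + y_mean

# Synthetic stand-in data: 100-dimensional features driven by 5 latent factors.
rng = np.random.default_rng(0)
n, d, k = 200, 100, 5
latent = rng.normal(size=(n, k))
mixing = rng.normal(size=(k, d))
X = latent @ mixing + 0.01 * rng.normal(size=(n, d))
y = 20 + latent @ np.array([3.0, -2.0, 1.5, 0.5, 1.0]) + 0.1 * rng.normal(size=n)

model = fit_reduced_regressor(X[:150], y[:150], n_components=k)
pred = predict_counts(model, X[150:])
mae = np.mean(np.abs(pred - y[150:]))
print(f"held-out mean absolute error: {mae:.3f}")
```

In a real pipeline the rows of `X` would be visual features extracted from each frame and `y` the annotated pedestrian counts; a supervised reduction method or a nonlinear regressor could replace either stage without changing the overall structure.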
