Counting people by clustering person detector outputs

We present a people counting system that estimates the number of people in a scene by clustering the outputs of a person detector with Dirichlet Process Mixture Models (DPMMs). For each frame, we run the person detector, treat its output as a set of detection areas, and compute features based on spatial, color, and temporal information for each detection. We then cluster the detections on these features using DPMMs and Gibbs sampling. Because the number of clusters is not fixed in advance, the system can estimate an arbitrary number of people or groups of people. Finally, we define a measure that calculates the actual number of people within each cluster and use it to infer the final estimate of the number of people in the scene.
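
The clustering step can be illustrated with a minimal sketch of collapsed Gibbs sampling for a DPMM. It is not the system's actual implementation. It makes several assumptions that the abstract does not state: each detection is reduced to a plain numeric feature vector (here, a 2-D box center), clusters are isotropic Gaussians with a known variance `sigma2`, and cluster means have a Gaussian prior with variance `tau2` centered on the data mean. The names `dpmm_gibbs` and `log_predictive` and all parameter values are hypothetical. The measure for the number of people within each cluster is not reproduced here, since the abstract does not specify it.

```python
import math
import random


def log_predictive(x, n, sums, sigma2, tau2, mu0):
    """Log predictive density of x under a cluster holding n points with
    per-dimension sums `sums` (conjugate Gaussian mean, known variance)."""
    total = 0.0
    for d in range(len(x)):
        prec = 1.0 / tau2 + n / sigma2            # posterior precision of the mean
        mean = (mu0[d] / tau2 + sums[d] / sigma2) / prec
        var = 1.0 / prec + sigma2                 # predictive variance
        total += -0.5 * math.log(2 * math.pi * var) - 0.5 * (x[d] - mean) ** 2 / var
    return total


def dpmm_gibbs(points, alpha=1.0, sigma2=1.0, tau2=100.0, iters=50, seed=0):
    """Return one cluster label per point; the number of clusters is not fixed."""
    rng = random.Random(seed)
    dim = len(points[0])
    mu0 = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    # Start with every detection in a single cluster.
    labels = [0] * len(points)
    counts = {0: len(points)}
    sums = {0: [sum(p[d] for p in points) for d in range(dim)]}
    next_id = 1
    for _ in range(iters):
        for i, x in enumerate(points):
            # Remove point i from its current cluster.
            k = labels[i]
            counts[k] -= 1
            for d in range(dim):
                sums[k][d] -= x[d]
            if counts[k] == 0:
                del counts[k], sums[k]
            # Weight of each existing cluster: size times predictive density.
            ids = list(counts)
            logw = [math.log(counts[c])
                    + log_predictive(x, counts[c], sums[c], sigma2, tau2, mu0)
                    for c in ids]
            # Weight of opening a new cluster: alpha times prior predictive.
            logw.append(math.log(alpha)
                        + log_predictive(x, 0, [0.0] * dim, sigma2, tau2, mu0))
            top = max(logw)
            weights = [math.exp(w - top) for w in logw]
            choice = rng.choices(range(len(weights)), weights=weights)[0]
            if choice == len(ids):
                k_new = next_id
                next_id += 1
                counts[k_new] = 0
                sums[k_new] = [0.0] * dim
            else:
                k_new = ids[choice]
            # Add point i to the sampled cluster.
            labels[i] = k_new
            counts[k_new] += 1
            for d in range(dim):
                sums[k_new][d] += x[d]
    return labels


# Hypothetical detection centers: two tight, well-separated groups.
centers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
           (100.0, 100.0), (101.0, 100.0), (100.0, 101.0), (101.0, 101.0)]
labels = dpmm_gibbs(centers, alpha=1e-3, sigma2=1.0, tau2=1e4, iters=20)
num_clusters = len(set(labels))
```

In this sketch the concentration parameter `alpha` controls how readily a new cluster is opened, which is what lets the sampler settle on a cluster count without one being supplied. A fuller version would use the spatial, color, and temporal features described above rather than box centers alone.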
