Combining spatial and temporal features for crowd counting with point supervision

In this paper, we present a new approach to counting the number of people who cross a counting line in video. The approach relies on point-level annotations in the training images and incorporates spatial features along with novel temporal features when training a structured random forest to estimate crowd density. By computing the crowd velocity, we model the crowd counting map as the elementwise product of the crowd density map and the crowd velocity map. Integrating the crowd counting map over the line-of-interest (LOI) locations yields the instantaneous LOI counts. We show that the results are comparable to those obtained with more complex and costly techniques.
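
The last step of the pipeline, multiplying the density map by the velocity map and integrating over the LOI, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `loi_count`, the boolean `loi_mask`, and the choice to represent velocity as a single per-pixel component normal to the line (in pixels per frame) are assumptions made for illustration, and the toy arrays stand in for the outputs of the density and velocity estimators.

```python
import numpy as np


def loi_count(density, velocity_normal, loi_mask):
    """Instantaneous LOI count for one frame (illustrative sketch).

    density         : 2-D array, estimated people per pixel
    velocity_normal : 2-D array, speed across the line per pixel
                      (assumed here to be in pixels per frame)
    loi_mask        : 2-D boolean array, True on line-of-interest pixels
    """
    # Crowd counting map: elementwise product of density and velocity.
    counting_map = density * velocity_normal
    # Discrete integration: sum the counting map over the LOI pixels.
    return float(counting_map[loi_mask].sum())


# Toy 4x5 frame: two pixels on the line carry some density.
density = np.zeros((4, 5))
density[1, 2] = 0.5
density[2, 2] = 0.25

# Hypothetical uniform crossing speed of 2 pixels per frame.
velocity = np.full((4, 5), 2.0)

# Vertical line of interest along column 2.
loi_mask = np.zeros((4, 5), dtype=bool)
loi_mask[:, 2] = True

count = loi_count(density, velocity, loi_mask)
print(count)
```

Summing these per-frame values over a time window would give a cumulative crossing count for that window, under the same assumptions.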
