Counting People Crossing a Line Using Integer Programming and Local Features

We propose an integer programming method for estimating the instantaneous count of pedestrians crossing a line of interest (LOI) in a video sequence. Through a line sampling process, the video is first converted into a temporal slice image. Next, the number of people is estimated in a set of overlapping sliding windows on the temporal slice image, using a regression function that maps local features to a count. Because the count in a sliding window is the sum of the instantaneous counts over the corresponding time interval, an integer program is used to recover the number of pedestrians crossing the LOI in each frame. Summing these instantaneous counts over a given time interval yields the cumulative count of pedestrians crossing the line. Compared with current line-counting methods, the proposed approach achieves state-of-the-art performance on several challenging crowd video data sets.
