Scene invariant multi camera crowd counting

Automated crowd counting has become an active field of computer vision research in recent years. Existing approaches are scene-specific, as they are designed to operate only in the single camera viewpoint that was used to train the system. Real world camera networks often span multiple viewpoints within a facility, including many regions of overlap. This paper proposes a novel scene invariant crowd counting algorithm that is designed to operate across multiple cameras. The approach uses camera calibration to normalise features between viewpoints and to compensate for regions of overlap. This compensation is performed by constructing an 'overlap map', which provides a measure of how much an object at one location is visible within other viewpoints. An investigation into the suitability of various feature types and regression models for scene invariant crowd counting is also conducted. The features investigated include object size, shape, edges and keypoints. The regression models evaluated include neural networks, K-nearest neighbours, linear regression and Gaussian process regression. Our experiments demonstrate accurate crowd counting across seven benchmark datasets, with the best performance observed when all features were used together with Gaussian process regression. The combination of scene invariance and multi camera crowd counting is evaluated by training the system on footage obtained from the QUT camera network and testing it on three cameras from the PETS 2009 database. Highly accurate crowd counting was observed, with a mean relative error of less than 10%. Our approach enables a pre-trained system to be deployed in a new environment without any additional training, bringing the field one step closer toward a 'plug and play' system.
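The overlap compensation and the mean relative error metric mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation: the paper works with calibrated image features and a regression model, whereas the sketch assumes, purely for illustration, that each camera's field of view is an axis-aligned rectangle on a shared ground plane and that each camera reports ground-plane points. The names `overlap_weight`, `fused_count` and `mean_relative_error` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical simplification: a camera's ground-plane footprint is a
# rectangle (x_min, y_min, x_max, y_max) in shared world coordinates.
Rect = Tuple[float, float, float, float]
Point = Tuple[float, float]


def visible_in(rect: Rect, x: float, y: float) -> bool:
    """Return True if ground-plane point (x, y) lies inside the footprint."""
    x0, y0, x1, y1 = rect
    return x0 <= x <= x1 and y0 <= y <= y1


def overlap_weight(footprints: Sequence[Rect], x: float, y: float) -> float:
    """Weight 1/n, where n is the number of cameras that can see (x, y).

    This plays the role of an 'overlap map' entry in this toy setting: a
    location seen by two cameras contributes half a count per camera.
    """
    n = sum(visible_in(r, x, y) for r in footprints)
    return 1.0 / n if n else 0.0


def fused_count(footprints: Sequence[Rect],
                points_per_camera: Sequence[List[Point]]) -> float:
    """Sum overlap-weighted contributions from every camera."""
    return sum(overlap_weight(footprints, x, y)
               for points in points_per_camera
               for x, y in points)


def mean_relative_error(estimates: Sequence[float],
                        truths: Sequence[float]) -> float:
    """Average of |estimate - truth| / truth over frames with truth > 0."""
    pairs = [(e, t) for e, t in zip(estimates, truths) if t > 0]
    return sum(abs(e - t) / t for e, t in pairs) / len(pairs)


# Two cameras whose footprints overlap for 5 <= x <= 10.
footprints = [(0.0, 0.0, 10.0, 10.0), (5.0, 0.0, 15.0, 10.0)]
# Three people; the one at (7, 5) is seen by both cameras.
points_per_camera = [[(2.0, 5.0), (7.0, 5.0)], [(7.0, 5.0), (12.0, 5.0)]]
total = fused_count(footprints, points_per_camera)
```

Without the weighting, the two cameras would report four people in total; with it, the person in the overlap region is counted once, so `total` is 3.0.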
