Surveillance of Crowded Environments: Modeling the Crowd by Its Global Properties

In this chapter, we consider aspects of the crowd that can be modeled holistically, by analyzing global properties. We first discuss the dynamic texture model for representing holistic motion flow, which treats the video as a sample from a linear dynamical system. By defining appropriate distances and kernels between dynamic textures, crowd motion can be recognized with standard classification algorithms. Besides motion flow, crowd size, i.e., the number of objects within a crowd, can also be modeled holistically. From a suitable set of low-level features, crowd counts can be estimated with a regression function that maps the features directly to the number of objects within the crowd. In both cases, the surveillance task is solved by analyzing global scene properties, with no need to detect or track individual objects. As a result, the solutions tend to be robust even when the crowd is large, the occlusions are substantial, the object interactions are complex, or the objects are small.
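The two holistic models named above can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes the common SVD-based (suboptimal) fit of a linear dynamical system x_{t+1} = A x_t + v_t, y_t = C x_t + w_t for the dynamic texture, and it uses ordinary least squares as a stand-in for the count regression, since the abstract does not say which regressor is used. The function names `fit_dynamic_texture`, `fit_count_regressor`, and `predict_count` are hypothetical.

```python
import numpy as np


def fit_dynamic_texture(frames, n_states):
    """Fit a linear dynamical system to a video with an SVD-based procedure.

    frames: array of shape (T, H, W); n_states: dimension of the hidden state.
    Returns (A, C, mean, X): state transition matrix, observation matrix,
    mean frame (as a column), and the estimated state sequence.
    """
    T = frames.shape[0]
    Y = frames.reshape(T, -1).T.astype(float)      # one column per frame
    mean = Y.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(Y - mean, full_matrices=False)
    C = U[:, :n_states]                            # principal components
    X = s[:n_states, None] * Vt[:n_states]         # states, shape (n_states, T)
    # Least-squares estimate of A such that X[:, 1:] is close to A @ X[:, :-1].
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])
    return A, C, mean, X


def fit_count_regressor(features, counts):
    """Ordinary least squares from per-frame features to crowd counts.

    Assumption: a linear regressor stands in for whatever regression
    function the chapter actually uses.
    """
    design = np.hstack([features, np.ones((features.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, counts, rcond=None)
    return weights


def predict_count(weights, features):
    """Apply the fitted linear map (with bias term) to new feature vectors."""
    design = np.hstack([features, np.ones((features.shape[0], 1))])
    return design @ weights
```

In this sketch, a whole video clip is summarized by the pair (A, C), which is what a distance or kernel between dynamic textures would compare, and a frame's count is predicted from its feature vector without detecting any individual object.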
