LCrowdV: Generating labeled videos for pedestrian detectors training and crowd behavior learning

Abstract We present a novel procedural framework to generate an arbitrary number of labeled crowd videos (LCrowdV). The resulting crowd video datasets can be used to design accurate algorithms or to train models for crowded scene understanding. Our approach has two components: a procedural simulation framework that generates crowd movements and behaviors, and a procedural rendering framework that generates different videos or images. Each video or image is automatically labeled based on the environment, number of pedestrians, density, behavior (personality), flow, lighting conditions, viewpoint, type of noise, etc. We can also increase realism by combining synthetically generated behaviors with real-world background videos. We demonstrate the benefits of LCrowdV over prior labeled crowd datasets by augmenting a real dataset with it, which improves accuracy in pedestrian detection and crowd classification. Furthermore, we evaluate the impact of removing the variety in different LCrowdV parameters to show the importance of the diversity of the data generated by our framework. LCrowdV has been made available as an online resource.
