Digital Scene Augmentation Techniques for Generating Photo-Realistic Virtual Crowds

Crowd estimation has a wide range of applications, particularly in computer vision, robotics and security surveillance. New computer vision techniques and deep learning have enabled large-scale crowd estimation, but progress has been hindered by the lack of high-quality, annotated and publicly available datasets. Although there have been several attempts to compile crowd datasets, collecting and labelling the data is a tedious and labour-intensive task. New privacy legislation also makes it difficult to release real-world footage to the public. In this paper, we present a novel method to generate photo-realistic, scalable, labelled synthetic crowds for the purpose of advancing the state of the art in crowd understanding techniques. We generate human models in scene-reconstructed environments, which are created from footage captured by aerial drone surveys. The crowds are then composited with the original images to generate photo-realistic data. The resulting dataset contains 500 high-resolution images with over 230,000 annotations and is intended to be made publicly available to further advance research in crowd understanding.
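The compositing step can be illustrated with a minimal sketch. It assumes the rendered crowd is available as an RGB layer with a per-pixel alpha mask and blends it over the original drone image with the standard "over" operator. The function names `composite_over` and `visible_heads`, the nested-list image layout, and the use of head positions as annotations are assumptions made for illustration, not the pipeline used in this work.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]   # RGB values in [0, 1]
Image = List[List[Pixel]]            # rows of pixels (hypothetical layout)


def composite_over(background: Image, crowd_rgb: Image,
                   crowd_alpha: List[List[float]]) -> Image:
    """Blend a rendered crowd layer over a background image.

    Uses the standard 'over' operator per pixel:
        out = alpha * foreground + (1 - alpha) * background
    """
    out: Image = []
    for bg_row, fg_row, a_row in zip(background, crowd_rgb, crowd_alpha):
        out_row = []
        for bg, fg, a in zip(bg_row, fg_row, a_row):
            out_row.append(tuple(a * f + (1.0 - a) * b for f, b in zip(fg, bg)))
        out.append(out_row)
    return out


def visible_heads(points: List[Tuple[float, float]],
                  width: int, height: int) -> List[Tuple[float, float]]:
    """Keep only projected head positions (x, y) that fall inside the image.

    Because the crowd is synthetic, these positions are assumed to be known
    from the renderer, so no manual labelling is needed.
    """
    return [(x, y) for x, y in points if 0 <= x < width and 0 <= y < height]


if __name__ == "__main__":
    background = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]]   # 1 x 2 image
    crowd_rgb = [[(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]    # red crowd layer
    crowd_alpha = [[1.0, 0.0]]                          # opaque, then transparent
    print(composite_over(background, crowd_rgb, crowd_alpha))
    print(visible_heads([(0.5, 0.2), (3.0, 0.0)], width=2, height=1))
```

In this sketch, pixels where the alpha mask is zero keep the original image unchanged, so the background stays untouched outside the rendered people.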
