Large-Scale Synthetic Urban Dataset for Aerial Scene Understanding

Geometric extraction and semantic understanding from a bird's-eye view play an important role in cyber-physical-social systems (CPSS), because they help humans and intelligent agents (IAs) perceive a larger range of the environment. However, because no comprehensive dataset captured from an oblique perspective exists, fog-end deep learning algorithms for this purpose remain largely undeveloped. In this paper, we propose a novel method for generating a large-scale synthetic dataset for geometric and semantic urban scene understanding from a bird's-eye view. The method has two main steps, modeling and rendering, which are carried out with CityEngine and Unreal Engine 4, respectively. In this way, aligned synthetic multi-modal data are obtained efficiently, including spectral images, semantic labels, and depth and normal maps. Specifically, terrain elevation, street graphs, building styles, and tree distributions are all randomly generated according to realistic conditions; a small number of handcrafted semantic labels, annotated by color, are spread throughout the scene; and virtual cameras move along realistic trajectories of unmanned aerial vehicles (UAVs). To evaluate the practicality of our dataset, we manually labeled tens of aerial images downloaded from the internet. The experimental results show that, in both pure and combined modes, the dataset improves performance significantly.
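
Because the semantic labels are annotated by color, a rendered label image has to be converted into per-pixel class indices before it can be used for training. The following is a minimal sketch of that conversion, not the authors' implementation: the palette, the class names, and the function name `decode_label_image` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]

# Hypothetical palette: these colors and classes are illustrative assumptions,
# not the palette used in the dataset.
PALETTE: Dict[RGB, int] = {
    (0, 0, 0): 0,        # unlabeled
    (255, 0, 0): 1,      # building
    (0, 255, 0): 2,      # tree
    (128, 128, 128): 3,  # road
}


def decode_label_image(
    pixels: List[List[RGB]],
    palette: Dict[RGB, int] = PALETTE,
    unknown: int = 0,
) -> List[List[int]]:
    """Map each RGB pixel of a color-coded label image to a class index.

    Colors that are missing from the palette fall back to `unknown`.
    """
    return [[palette.get(tuple(p), unknown) for p in row] for row in pixels]
```

Since the rendered modalities are aligned, the resulting index map can be paired pixel by pixel with the corresponding image, depth map, and normal map.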
