An enhanced 3D model and generative adversarial network for automated generation of horizontal building mask images and cloudless aerial photographs

Abstract. Information extracted from aerial photographs is widely used in urban planning and design. An effective way to understand the current state of a target region is to detect buildings in aerial photographs with deep learning. However, the building mask images used to train such deep learning models must in many cases be generated manually. To overcome this challenge, a method has been proposed that automatically generates mask images from three-dimensional (3D) virtual models textured with aerial photographs. Some aerial photographs contain clouds, which degrade image quality. Removing these clouds with a generative adversarial network (GAN) can improve training quality. The objective of this research was therefore to propose a method for automatically generating building mask images from 3D virtual models textured with aerial photographs. In this study, removing clouds from the aerial photographs with a GAN improved training quality. A model trained on datasets generated by the proposed method detected buildings in aerial photographs with an intersection over union (IoU) of 0.651.
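
For readers unfamiliar with the IoU metric reported above, the following is a minimal sketch of how it can be computed for a pair of binary building masks. The abstract does not state whether IoU was computed pixel-wise or per detected instance, so the pixel-wise form here is an assumption, and the function name `mask_iou` and the toy masks are hypothetical.

```python
def mask_iou(pred, truth):
    """Pixel-wise intersection over union of two binary masks.

    Masks are nested lists of 0/1 values with identical shapes.
    Assumption: pixel-wise IoU; the source does not specify the variant.
    """
    same_rows = len(pred) == len(truth)
    if not same_rows or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels marked as building in both masks
    union = 0         # pixels marked as building in at least one mask
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            p, t = bool(p), bool(t)
            intersection += int(p and t)
            union += int(p or t)

    # Two empty masks agree perfectly; treat that case as IoU = 1.0.
    return intersection / union if union else 1.0


# Toy example: 2 overlapping building pixels out of 4 marked in either mask.
predicted = [[0, 1, 1],
             [0, 1, 0]]
ground_truth = [[0, 1, 0],
                [0, 1, 1]]
print(mask_iou(predicted, ground_truth))
```

Under this definition, an IoU of 0.651 would mean that about 65% of the pixels marked as building in either the predicted or the reference mask are marked in both.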
