Weakly Supervised Domain Adaptation for Built-up Region Segmentation in Aerial and Satellite Imagery

Abstract This paper proposes a novel domain adaptation algorithm to handle the challenges posed by satellite and aerial imagery, and demonstrates its effectiveness on the built-up region segmentation problem. Built-up area estimation is an important component in understanding the human impact on the environment, the effect of public policy and, more generally, urban population analysis. The diverse nature of aerial and satellite imagery (capturing different geographical locations, terrains and weather conditions) and the lack of labeled data covering this diversity make it difficult for machine learning algorithms to generalize on such tasks, especially across multiple domains. Re-training for a new domain is both computationally and labor expensive, mainly due to the cost of collecting the pixel-level labels required for the segmentation task. Domain adaptation algorithms have been proposed to enable algorithms trained on images of one domain (source) to work on images from another dataset (target). Unsupervised domain adaptation is a popular choice since it allows the trained model to adapt without requiring any ground-truth information for the target domain. However, because aerial and satellite imagery lacks the strong spatial context and structure of ground-level imagery, applying existing unsupervised domain adaptation methods results in sub-optimal adaptation. We thoroughly study the limitations of existing domain adaptation methods and propose a weakly-supervised adaptation strategy in which we assume image-level labels are available for the target domain. More specifically, we design a built-up area segmentation network (as an encoder-decoder), with an image classification head added to guide the adaptation. The devised system is able to address the problem of visual differences across multiple satellite and aerial imagery datasets, ranging from high resolution (HR) to very high resolution (VHR), by operating on both the latent space and the structured output space.
A realistic and challenging HR dataset is created by hand-tagging 73.4 sq-km of Rwanda, capturing a variety of built-up structures over different terrain. The developed dataset is spatially rich compared to existing datasets and covers diverse built-up scenarios, including built-up areas in forests and deserts, mud houses, and tin and colored rooftops. Extensive experiments are performed by adapting from single-source domain datasets, such as the Massachusetts Buildings Dataset, to segment the target domain. We achieve high gains, ranging from 11.6% to 52% in IoU, over the existing state-of-the-art methods.
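
For readers unfamiliar with the IoU (intersection over union) metric in which the gains above are reported, the following is a minimal sketch of how it could be computed for a binary built-up mask. The function name `binary_iou`, the nested-list mask representation and the handling of two empty masks are assumptions made for illustration; they are not taken from the paper's evaluation code.

```python
def binary_iou(pred, target):
    """Intersection over union of two binary masks.

    Both masks are nested lists of 0/1 values with the same shape
    (an illustrative representation, not the paper's data format).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # pixel marked built-up in both masks
            if p or t:
                union += 1  # pixel marked built-up in at least one mask
    # Assumed convention: two entirely empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Toy 2x3 masks: 1 = built-up, 0 = background.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

iou = binary_iou(pred, target)
print(iou)  # 2 overlapping pixels out of 4 in the union -> 0.5
```

A higher IoU means the predicted built-up mask overlaps the hand-tagged mask more closely, with 1.0 indicating identical masks.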
