Semantic segmentation of slums in satellite images using transfer learning on fully convolutional neural networks

Abstract Unprecedented urbanization, particularly in countries of the Global South, results in informal urban development processes, especially in megacities. With an estimated 1 billion slum dwellers globally, the United Nations has made the fight against poverty the number one sustainable development goal. To provide better infrastructure and thus a better life for slum dwellers, detailed information on the spatial location and size of slums is of crucial importance. In the past, remote sensing has proven to be an extremely valuable and effective tool for mapping slums. The machine learning approaches used for mapping, however, required considerable effort to train the models. Recent advances in deep learning allow trained fully convolutional networks (FCNs) to be transferred from one data set to another. In this study, we therefore analyze the transfer learning capabilities of FCNs for slum mapping in various satellite images. A model trained on very high resolution optical satellite imagery from QuickBird is transferred to Sentinel-2 and TerraSAR-X data. While free-of-charge Sentinel-2 data is widely available, its comparatively low resolution makes slum mapping a challenging task. TerraSAR-X data, on the other hand, has a higher resolution and is considered a powerful data source for intra-urban structure analysis. Owing to the different image characteristics of SAR compared with optical data, however, transferring the model did not improve the performance of semantic segmentation for TerraSAR-X. For the optical data, in contrast, we observe very high accuracies for mapped slums: the QuickBird image obtains 86–88% (positive predictive value, PPV, and sensitivity), and applying transfer learning yields a substantial increase for Sentinel-2 (from 38 to 55% for PPV and from 79 to 85% for sensitivity). Transfer learning proves extremely valuable for retrieving information on small-scale urban structures such as slum patches, even in satellite images of decametric resolution.
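
As a reading aid for the accuracy figures above, the following minimal sketch shows how PPV and sensitivity can be computed for the slum class from a predicted and a reference binary mask. It assumes the standard definitions PPV = TP / (TP + FP) and sensitivity = TP / (TP + FN); the function name `ppv_and_sensitivity` and the toy masks are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def ppv_and_sensitivity(
    predicted: Sequence[Sequence[int]],
    reference: Sequence[Sequence[int]],
) -> Tuple[float, float]:
    """Return (PPV, sensitivity) for the slum class (label 1) of two binary masks."""
    tp = fp = fn = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            if p == 1 and r == 1:
                tp += 1  # slum pixel correctly mapped
            elif p == 1 and r == 0:
                fp += 1  # non-slum pixel wrongly mapped as slum
            elif p == 0 and r == 1:
                fn += 1  # slum pixel missed by the model
    # Guard against empty denominators (no predicted or no reference slum pixels).
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return ppv, sensitivity


# Hypothetical 3x4 toy masks, for illustration only (1 = slum, 0 = other).
reference = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
predicted = [
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]

ppv, sens = ppv_and_sensitivity(predicted, reference)
print(f"PPV = {ppv:.2f}, sensitivity = {sens:.2f}")
```

In this toy case, three of the five predicted slum pixels are correct and three of the four reference slum pixels are found. A high sensitivity combined with a low PPV, as reported for Sentinel-2 before transfer learning, therefore indicates that most slum pixels are detected but many non-slum pixels are also labeled as slum.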
