Deep Fully Convolutional Networks for the Detection of Informal Settlements in VHR Images

This letter investigates fully convolutional networks (FCNs) for the detection of informal settlements in very high resolution (VHR) satellite images. Informal settlements, or slums, are proliferating in developing countries, and their detection and classification provide vital information for decision making and for planning urban upgrading processes. Distinguishing different urban structures in VHR images is challenging because the classes have an abstract semantic definition, in contrast to standard land-cover classes. The task therefore requires the extraction of texture and spatial features. To this end, we introduce deep FCNs that perform pixel-wise image labeling by automatically learning a higher level representation of the data. Deep FCNs can learn a hierarchy of features associated with increasing levels of abstraction, from raw pixel values to edges and corners up to complex spatial patterns. We present a deep FCN that uses dilated convolutions of increasing spatial support. It is capable of learning informative features that capture long-range pixel dependencies while keeping the number of network parameters limited. Experiments carried out on a QuickBird image acquired over the city of Dar es Salaam, Tanzania, show that the proposed FCN outperforms state-of-the-art convolutional networks. Moreover, the computational cost of the proposed technique is significantly lower than that of standard patch-based architectures.
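The claim that dilated convolutions enlarge the spatial support without adding parameters can be checked with simple arithmetic. The sketch below is illustrative only: the kernel size, the dilation factors, the channel width, and the helper names `receptive_field` and `weight_count` are assumptions chosen for the example, not the configuration used in the letter. It assumes stride-1 convolutions with no pooling.

```python
# Illustrative sketch: receptive field and weight count of a stack of
# stride-1 convolutions. All values below are assumptions for the example,
# not the architecture reported in the letter.

def receptive_field(kernel_size, dilations):
    """Side length (in pixels) of the receptive field of the last layer.

    Each stride-1 layer with dilation d widens the field by (k - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def weight_count(kernel_size, channels, num_layers):
    """Number of convolution weights (biases ignored).

    Dilation spreads the kernel taps apart but does not add taps,
    so the count does not depend on the dilation factor.
    """
    return num_layers * kernel_size * kernel_size * channels * channels


dilations = [1, 2, 3, 4, 5, 6]  # hypothetical, increasing spatial support
dilated_rf = receptive_field(3, dilations)
plain_rf = receptive_field(3, [1] * len(dilations))
weights = weight_count(3, channels=16, num_layers=len(dilations))

print(f"dilated stack receptive field: {dilated_rf} x {dilated_rf}")
print(f"plain stack receptive field:   {plain_rf} x {plain_rf}")
print(f"weights in either stack:       {weights}")
```

Under these assumptions, the dilated stack covers a 43 x 43 neighbourhood while the undilated stack of the same depth covers 13 x 13, and both use the same number of weights.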
