Deep Attention and Multi-Scale Networks for Accurate Remote Sensing Image Segmentation

Remote sensing image segmentation is a challenging task in remote sensing image analysis, and it is of great significance for urban planning, crop planting, and other fields that need rich information about the land. Technically, the task is made difficult by the ultra-high resolution, large shooting angles, and feature complexity of remote sensing images. To address these issues, we propose a deep learning-based network called ATD-LinkNet with several customized modules. Specifically, we propose a replaceable module named the AT block, which uses multi-scale convolution and an attention mechanism, as the building block of ATD-LinkNet. The AT block fuses features at different scales and effectively exploits the abundant spatial and semantic information in remote sensing images. To refine the nonlinear boundaries of internal objects in remote sensing images, we adopt dense upsampling convolution in the decoder of ATD-LinkNet. We conduct extensive comparative experiments on two public remote sensing datasets (Potsdam and DeepGlobe Road Extraction). The results show that ATD-LinkNet outperforms most state-of-the-art networks. We obtain 89.0% pixel-level accuracy on the Potsdam dataset and 62.68% mean Intersection over Union on the DeepGlobe Road Extraction dataset.
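The two reported metrics, pixel-level accuracy and mean Intersection over Union (mIoU), can be illustrated with a minimal sketch. It uses the standard confusion-matrix definitions on flattened label lists. The function names and the toy labels are hypothetical, and the exact evaluation protocol behind the reported numbers (for example, which classes or pixels are excluded) is not stated in the abstract, so this is not necessarily the authors' procedure.

```python
def confusion_matrix(pred, target, num_classes):
    """Count pixels; rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of all pixels whose predicted class equals the true class."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither prediction nor ground truth are skipped
    (an assumption of this sketch, not a detail taken from the paper).
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        fn = sum(cm[c]) - tp                       # true c, predicted class differs
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example: six pixels, two classes (e.g. 0 = background, 1 = road).
target = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
cm = confusion_matrix(pred, target, num_classes=2)
print(cm, pixel_accuracy(cm), mean_iou(cm))
```

Because mIoU penalizes both false positives and false negatives for every class, it is usually lower than pixel accuracy on the same predictions, which matters for thin structures such as roads that occupy few pixels.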
