Self-constructing graph neural networks to model long-range pixel dependencies for semantic segmentation of remote sensing images

ABSTRACT Capturing global contextual representations in remote sensing images by exploiting long-range pixel–pixel dependencies has been shown to improve segmentation performance. However, how to do this efficiently is an open question as current approaches of utilizing attention schemes, or very deep models to increase the field of view, increases complexity and memory consumption. Inspired by recent work on graph neural networks, we propose the Self-Constructing Graph (SCG) module that learns a long-range dependency graph directly from the image data and uses it to capture global contextual information efficiently to improve semantic segmentation. The SCG module provides a high degree of flexibility for constructing segmentation networks that seamlessly make use of the benefits of variants of graph neural networks (GNN) and convolutional neural networks (CNN). Our SCG-GCN model, a variant of SCG-Net built upon graph convolutional networks (GCN), performs semantic segmentation in an end-to-end manner with competitive performance on the publicly available ISPRS Potsdam and Vaihingen datasets, achieving a mean F1-scores of 92.0% and 89.8%, respectively. We conclude that the SCG-Net is an attractive architecture for semantic segmentation of remote sensing images since it achieves competitive performance with much fewer parameters and lower computational cost compared to related models based on convolutional neural networks.

[1]  Xiangyu Zhang,et al.  Large Kernel Matters — Improve Semantic Segmentation by Global Convolutional Network , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Qinghui Liu,et al.  A Comparison of Deep Learning Architectures for Semantic Mapping of Very High Resolution Images , 2018, IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium.

[3]  Jamie Sherrah,et al.  Effective semantic pixel labelling with convolutional networks and Conditional Random Fields , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[4]  Pascal Fua,et al.  SLIC Superpixels Compared to State-of-the-Art Superpixel Methods , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Jie Jiang,et al.  RWSNet: a semantic segmentation network based on SegNet combined with random walk for remote sensing , 2020, International Journal of Remote Sensing.

[6]  Mathias Niepert,et al.  Learning Convolutional Neural Networks for Graphs , 2016, ICML.

[7]  Bertrand Le Saux,et al.  Joint Learning from Earth Observation and OpenStreetMap Data to Get Faster Better Semantic Maps , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[8]  Xiao Huang,et al.  Towards Deeper Graph Neural Networks with Differentiable Group Normalization , 2020, NeurIPS.

[9]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Xiao Lin,et al.  Image Classification with Hierarchical Multigraph Networks , 2019, BMVC.

[11]  Hassan Ghassemian,et al.  Building extraction from very high-resolution synthetic aperture radar images based on statistical and structural information fusion , 2019, International Journal of Remote Sensing.

[12]  Yue Wang,et al.  Dynamic Graph CNN for Learning on Point Clouds , 2018, ACM Trans. Graph..

[13]  Markus Gerke,et al.  The ISPRS benchmark on urban object classification and 3D building reconstruction , 2012 .

[14]  Michael Kampffmeyer,et al.  Self-Constructing Graph Convolutional Networks for Semantic Labeling , 2020, ArXiv.

[15]  Jure Leskovec,et al.  How Powerful are Graph Neural Networks? , 2018, ICLR.

[16]  Alexandr A. Kalinin,et al.  Albumentations: fast and flexible image augmentations , 2018, Inf..

[17]  Roberto Cipolla,et al.  SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Sanja Fidler,et al.  3D Graph Neural Networks for RGBD Semantic Segmentation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[19]  Michael Kampffmeyer,et al.  Urban Land Cover Classification With Missing Data Modalities Using Deep Convolutional Neural Networks , 2017, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

[20]  A CNN-GCN Framework for Multi-Label Aerial Image Scene Classification , 2020, IGARSS 2020 - 2020 IEEE International Geoscience and Remote Sensing Symposium.

[21]  Uwe Stilla,et al.  Classification With an Edge: Improving Semantic Image Segmentation with Boundary Detection , 2016, ISPRS Journal of Photogrammetry and Remote Sensing.

[22]  Xiaogang Wang,et al.  Context Encoding for Semantic Segmentation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[23]  Pietro Liò,et al.  Deep Graph Infomax , 2018, ICLR.

[24]  Junzhou Huang,et al.  Adaptive Sampling Towards Fast Graph Representation Learning , 2018, NeurIPS.

[25]  Jure Leskovec,et al.  Inductive Representation Learning on Large Graphs , 2017, NIPS.

[26]  F. Scarselli,et al.  A new model for learning in graph domains , 2005, Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005..

[27]  Bolei Zhou,et al.  Object Detectors Emerge in Deep Scene CNNs , 2014, ICLR.

[28]  Stephen Lin,et al.  GCNet: Non-Local Networks Meet Squeeze-Excitation Networks and Beyond , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).

[29]  Jun Fu,et al.  Dual Attention Network for Scene Segmentation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Jan Eric Lenssen,et al.  Fast Graph Representation Learning with PyTorch Geometric , 2019, ArXiv.

[31]  Xiao Xiang Zhu,et al.  RiFCN: Recurrent Network in Fully Convolutional Network for Semantic Segmentation of High Resolution Remote Sensing Images , 2018, ArXiv.

[32]  Seyed-Ahmad Ahmadi,et al.  V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation , 2016, 2016 Fourth International Conference on 3D Vision (3DV).

[33]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[34]  Jamie Sherrah,et al.  Fully Convolutional Networks for Dense Semantic Labelling of High-Resolution Aerial Imagery , 2016, ArXiv.

[35]  Xavier Bresson,et al.  Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering , 2016, NIPS.

[36]  Pierre Vandergheynst,et al.  Wavelets on Graphs via Spectral Graph Theory , 2009, ArXiv.

[37]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[38]  Ying Wang,et al.  Gated Convolutional Neural Network for Semantic Segmentation in High-Resolution Images , 2017, Remote. Sens..

[39]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[40]  Max Welling,et al.  Variational Graph Auto-Encoders , 2016, ArXiv.

[41]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[42]  Abhinav Gupta,et al.  The More You Know: Using Knowledge Graphs for Image Classification , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Samy Bengio,et al.  Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks , 2019, KDD.

[44]  F. Chung,et al.  Spectra of random graphs with given expected degrees , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[45]  Yansheng Li,et al.  Combining Deep Semantic Segmentation Network and Graph Convolutional Neural Network for Semantic Segmentation of Remote Sensing Imagery , 2020, Remote. Sens..

[46]  Garrison W. Cottrell,et al.  Understanding Convolution for Semantic Segmentation , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[47]  Michael Kampffmeyer,et al.  Dense Dilated Convolutions Merging Network for Semantic Mapping of Remote Sensing Images , 2019, 2019 Joint Urban Remote Sensing Event (JURSE).

[48]  Pierre Vandergheynst,et al.  Geometric Deep Learning: Going beyond Euclidean data , 2016, IEEE Signal Process. Mag..

[49]  Sildomar T. Monteiro,et al.  Semantic segmentation of multisensor remote sensing imagery with deep ConvNets and higher-order conditional random fields , 2019, Journal of Applied Remote Sensing.

[50]  Guosheng Lin,et al.  Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Qinghui Liu,et al.  Dense Dilated Convolutions’ Merging Network for Land Cover Classification , 2020, IEEE Transactions on Geoscience and Remote Sensing.

[53]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[54]  Joan Bruna,et al.  Few-Shot Learning with Graph Neural Networks , 2017, ICLR.

[55]  Yichen Wei,et al.  Relation Networks for Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[56]  Martin Simonovsky,et al.  Large-Scale Point Cloud Semantic Segmentation with Superpoint Graphs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[57]  Min Deng,et al.  Geospatial relation captioning for high-spatial-resolution images by using an attention-based neural network , 2019, International Journal of Remote Sensing.

[58]  Michael Kampffmeyer,et al.  Multi-view Self-Constructing Graph Convolutional Networks with Adaptive Class Weighting Loss for Semantic Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[59]  Michael Christian Kampffmeyer Advancing Segmentation and Unsupervised Learning Within the Field of Deep Learning , 2018 .

[60]  Sen Jia,et al.  How Much Position Information Do Convolutional Neural Networks Encode? , 2020, ICLR.

[61]  Eric P. Xing,et al.  Symbolic Graph Reasoning Meets Convolutions , 2018, NeurIPS.

[62]  Hao Wang,et al.  Rethinking Knowledge Graph Propagation for Zero-Shot Learning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[63]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[64]  Bertrand Le Saux,et al.  Semantic Segmentation of Earth Observation Data Using Multimodal and Multi-scale Deep Networks , 2016, ACCV.

[65]  Xiaogang Wang,et al.  Pyramid Scene Parsing Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[66]  Abhinav Gupta,et al.  Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.