Attention-Based DSM Fusion Network for Semantic Segmentation of High-Resolution Remote-Sensing Images

For semantic segmentation of high-resolution remote-sensing images, digital surface model (DSM) information is useful for improving the accuracy and robustness of segmentation models. However, because the feature distributions of spectral and DSM images vary significantly across scenes, it is difficult to fuse them effectively in popular deep network models. To address this issue, we propose an attention-based DSM fusion network (ADF-Net) for semantic segmentation of high-resolution remote-sensing images. The proposed network makes two contributions. First, we design an attention-based feature fusion module that selectively gathers features from spectral and DSM information through a channel attention mechanism and then combines them into high-quality fused features. Second, we introduce a residual feature refinement module that adaptively reduces the redundant information carried by skip connections. We evaluate the proposed network on the ISPRS Vaihingen and Potsdam datasets; experimental results demonstrate that our model outperforms state-of-the-art methods.
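The idea of channel-attention fusion can be illustrated with a minimal sketch. This is not the ADF-Net module itself, whose layers are not specified in the abstract. It assumes a squeeze-and-excitation style gate: each channel is globally average-pooled, passed through a per-channel affine map and a sigmoid, and used to rescale that channel before the two modalities are summed. The function names, the `params` dictionary, the per-channel affine gate, and the element-wise sum are all hypothetical choices made for illustration; a trained network would learn the gate with fully connected or 1D convolution layers.

```python
import math


def global_avg_pool(feat):
    """Average each channel of a [C][H][W] nested-list feature map."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]


def channel_gate(feat, weights, biases):
    """Per-channel sigmoid gate computed from pooled statistics.

    `weights` and `biases` are hypothetical stand-ins for learned parameters.
    """
    pooled = global_avg_pool(feat)
    return [1.0 / (1.0 + math.exp(-(w * p + b)))
            for p, w, b in zip(pooled, weights, biases)]


def scale_channels(feat, gates):
    """Multiply every value in channel c by gates[c]."""
    return [[[g * v for v in row] for row in ch] for ch, g in zip(feat, gates)]


def fuse(spectral, dsm, params):
    """Gate each modality channel-wise, then sum the two results element-wise."""
    g_spec = channel_gate(spectral, params["w_spec"], params["b_spec"])
    g_dsm = channel_gate(dsm, params["w_dsm"], params["b_dsm"])
    s = scale_channels(spectral, g_spec)
    d = scale_channels(dsm, g_dsm)
    fused = [[[a + b for a, b in zip(row_s, row_d)]
              for row_s, row_d in zip(ch_s, ch_d)]
             for ch_s, ch_d in zip(s, d)]
    return fused, g_spec, g_dsm


# Toy inputs: 2 channels of 2x2 features per modality.
spectral = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
dsm = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
# Zero weights and biases make every gate exactly 0.5.
params = {"w_spec": [0.0, 0.0], "b_spec": [0.0, 0.0],
          "w_dsm": [0.0, 0.0], "b_dsm": [0.0, 0.0]}
fused, g_spec, g_dsm = fuse(spectral, dsm, params)
```

With non-zero gate parameters, a channel whose pooled response is large in one modality receives a gate closer to 1 and contributes more to the fused feature, which is the selective behavior the abstract attributes to channel attention.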
