Lightweight Attention Network for Very High-Resolution Image Semantic Segmentation

Semantic segmentation is one of the most challenging tasks in very high-resolution (VHR) remote sensing applications. Deep convolutional neural networks (DCNNs) based on the attention mechanism have shown outstanding performance in the semantic segmentation of VHR remote sensing images. However, existing attention-guided methods require estimating a large number of parameters, and because few labeled samples are available, these parameters are estimated poorly, which degrades the segmentation results. In this article, we propose a multistage feature fusion lightweight (MSFFL) model that greatly reduces the number of parameters while improving the accuracy of semantic segmentation. In this model, two parallel enhanced attention modules, i.e., the spatial attention module (SAM) and the channel attention module (CAM), are designed by encoding position information. Then, a covariance calculation strategy is adopted to recalibrate the generated attention maps. Integrating the enhanced attention modules into the proposed lightweight module yields an efficient lightweight attention network (LiANet). The performance of the proposed LiANet is assessed on two benchmark datasets. Experimental results demonstrate that LiANet achieves promising performance with a small number of parameters.
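The abstract does not give the equations behind the two attention branches, so the following is only an illustrative sketch of the general idea of covariance-based attention applied in parallel over channels and over spatial positions. It is not the LiANet implementation. The function names, the softmax normalization, the residual sum, and the way the two branches are merged are all assumptions made for illustration, and the position encoding step is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def covariance_attention_map(rows):
    """rows: (K, M) array, one row per item (a channel or a position).

    Returns a (K, K) attention map: the softmax of the sample covariance
    between rows. Using covariance as the affinity is an assumption here.
    """
    m = rows.shape[1]
    centered = rows - rows.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / max(m - 1, 1)
    return softmax(cov, axis=-1)

def channel_attention(feat):
    """feat: (C, H, W). Reweights channels by a C x C attention map."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)
    attn = covariance_attention_map(flat)          # (C, C)
    return feat + (attn @ flat).reshape(c, h, w)   # residual connection (assumed)

def spatial_attention(feat):
    """feat: (C, H, W). Reweights positions by an N x N map, N = H * W."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)
    attn = covariance_attention_map(flat.T)        # (N, N)
    return feat + (attn @ flat.T).T.reshape(c, h, w)

def parallel_attention(feat):
    # Hypothetical merge: run both branches on the same input and sum them.
    return channel_attention(feat) + spatial_attention(feat)
```

In this sketch the spatial map has size N x N with N = H * W, so it is only practical on small feature maps. A lightweight design would need a cheaper formulation, which the abstract does not specify.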
