ConvTransNet: A CNN–Transformer Network for Change Detection With Multiscale Global–Local Representations

Change detection (CD) in optical remote sensing images has benefited significantly from deep convolutional neural networks (CNNs), which are strong at modeling local features in bitemporal images. In addition, the recent rise of transformer modules has improved global feature extraction from bitemporal remote sensing images. However, existing simple cascades of deep CNNs and transformer modules show limited CD performance on small changed areas because they lack multiscale information. To address this issue, we propose a new CNN–transformer network (ConvTransNet) with a multiscale framework to better exploit global–local information in optical remote sensing images. In ConvTransNet, we propose the parallel-branch ConvTrans block as the basic component for generating global–local features: it adaptively integrates the global features summarized by a transformer-based branch and the local features extracted by a convolution-based branch, which makes changed and unchanged areas easier to distinguish. By fusing multiple global–local features at different scales, ConvTransNet improves the robustness of CD performance on changed areas of different sizes, especially small ones. Experiments on two public CD datasets of optical remote sensing images, LEVIR-CD and CDD, demonstrate that ConvTransNet achieves better CD performance than other commonly used methods.
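
To make the parallel-branch idea concrete, the following is a minimal NumPy sketch and not the ConvTransNet implementation. The abstract does not specify how the two branches are built or fused, so everything here is an assumption made for illustration: the function names, the fixed 3x3 kernel standing in for the convolution-based branch, the single-head self-attention with identity projections standing in for the transformer-based branch, the per-channel sigmoid gate used as the "adaptive" fusion, and the 2x2 average pooling used to form coarser scales.

```python
import numpy as np


def local_branch(x, kernel):
    """Apply one 3x3 kernel to every channel with zero padding; x is (H, W, C)."""
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w, :]
    return out


def global_branch(x):
    """Single-head self-attention over all H*W positions (identity projections)."""
    h, w, c = x.shape
    tokens = x.reshape(h * w, c)
    scores = tokens @ tokens.T / np.sqrt(c)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return (weights @ tokens).reshape(h, w, c)


def fuse(local, global_, gate_logit):
    """Hypothetical adaptive fusion: a per-channel sigmoid gate mixes the branches."""
    gate = 1.0 / (1.0 + np.exp(-gate_logit))
    return gate * local + (1.0 - gate) * global_


def global_local_block(x, kernel, gate_logit):
    """Run both branches in parallel on the same input, then fuse them."""
    return fuse(local_branch(x, kernel), global_branch(x), gate_logit)


def multiscale_features(x, kernel, gate_logit, num_scales=3):
    """Collect global-local features at successively halved resolutions."""
    feats = []
    for _ in range(num_scales):
        feats.append(global_local_block(x, kernel, gate_logit))
        h, w, c = x.shape
        # 2x2 average pooling (assumed downsampling; odd edges are cropped)
        x = x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))
    return feats


# Usage on a random float feature map (stand-in for real image features)
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))
kernel = np.full((3, 3), 1.0 / 9.0)   # simple averaging kernel
gate_logit = np.zeros(4)              # sigmoid(0) = 0.5: equal weight per branch
feats = multiscale_features(x, kernel, gate_logit)
```

In a trained network the kernel, the attention projections, and the gate would be learned, and a decoder would fuse the per-scale features into a change map; the sketch only shows how local and global features can be computed side by side and combined at several resolutions.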
