Self-supervised learning for joint SAR and multispectral land cover classification

Self-supervised learning techniques are gaining popularity because they can build effective models even when only small amounts of labeled data are available. In this paper, we present a framework and specific tasks for self-supervised training of multichannel models, such as those that fuse multispectral and synthetic aperture radar (SAR) images. We show that the proposed self-supervised approach is highly effective at learning features that correlate with the labels for land cover classification. This is enabled by an explicit design of pretraining tasks that promotes bridging the gaps between sensing modalities and exploiting the spectral characteristics of the input. When limited labels are available, the proposed self-supervised pretraining followed by supervised finetuning for land cover classification with SAR and multispectral data outperforms conventional approaches such as purely supervised learning, initialization from ImageNet pretraining, and recent self-supervised approaches for computer vision tasks.
