A Deep Cross-Modality Hashing Network for SAR and Optical Remote Sensing Images Retrieval

Content-based remote sensing image retrieval (CBRSIR) has recently become a hot topic due to its wide applications in the analysis of remote sensing data. However, since conventional CBRSIR is unsuitable in harsh environments, this article focuses on cross-modality CBRSIR (CM-CBRSIR) between synthetic aperture radar (SAR) and optical images. Besides the interclass and intraclass variation problems of CBRSIR, CM-CBRSIR is limited by the prominent modality discrepancy caused by different imaging mechanisms. To address this limitation, this study proposes a deep cross-modality hashing network. First, we transform three-channel optical images into four different types of single-channel images to increase the diversity of the training modalities. This helps the network focus on extracting contour and texture features shared across modalities and makes it less sensitive to color information. Second, we pair a randomly selected type of transformed image with its corresponding SAR or optical image to form the image pairs that are fed into the network. This training strategy, with paired image data, eliminates the large cross-modality variations. Finally, the triplet loss, in combination with the hash function, helps the model extract discriminative image features and improves retrieval efficiency. To further evaluate the proposed method, we construct a SAR-optical dual-modality remote sensing image dataset containing 12 categories. Experimental results demonstrate the superiority of the proposed method with regard to efficiency and generality.
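
The three steps in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not name the four single-channel types, so the sketch assumes the R, G, and B channels plus a luminance grayscale image. The function names, the margin value, the squared Euclidean distance, and the sign-based binarization are likewise illustrative assumptions.

```python
import numpy as np


def to_single_channel_variants(rgb):
    """Return four single-channel views of an H x W x 3 image.

    Assumption: the four types are R, G, B and a luminance grayscale;
    the abstract does not say which transforms are actually used.
    """
    rgb = np.asarray(rgb, dtype=np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    gray = 0.299 * r + 0.587 * g + 0.114 * b  # ITU-R BT.601 luma weights
    return {"r": r, "g": g, "b": b, "gray": gray}


def random_variant(rgb, rng):
    """Pick one of the four single-channel views at random for pairing."""
    variants = to_single_channel_variants(rgb)
    name = str(rng.choice(sorted(variants)))
    return name, variants[name]


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on squared Euclidean distances.

    Inputs are (batch, code_length) arrays of real-valued codes, for
    example tanh outputs of a hashing layer (an assumed relaxation).
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return float(np.maximum(d_pos - d_neg + margin, 0.0).mean())


def binarize(codes):
    """Map real-valued codes to {-1, +1} hash bits by their sign."""
    return np.where(np.asarray(codes) >= 0, 1, -1).astype(np.int8)


def hamming_distance(a, b):
    """Count the bit positions in which two binary codes differ."""
    return int(np.count_nonzero(np.asarray(a) != np.asarray(b)))
```

At retrieval time, a query from one modality would be binarized and compared with database codes from the other modality by Hamming distance, which is what makes hashing-based retrieval cheap in storage and search time.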
