Remote-sensing image retrieval with tree-triplet-classification networks

Abstract For the past few years, convolutional neural networks (CNNs) have dominated content-based remote sensing image retrieval (CBRSIR) because of their markedly superior performance. However, most CNN models used for CBRSIR were originally designed for image classification rather than image retrieval. We argue that triplet networks, designed in the context of metric learning, are a more natural fit for CBRSIR. However, they use only the information of whether two input images belong to the same class, and so fail to fully exploit class labels. Moreover, all existing CNN-based CBRSIR methods ignore prior knowledge about interclass relationships, which, if used properly, can greatly improve retrieval performance. To address these issues, we introduce an easy way to organize the semantic relationships among classes as a category tree, and propose a novel CNN model called the tree-triplet-classification (T-T-C) network. Its key characteristics are as follows. First, a T-T-C network integrates metric learning with classification prediction, simultaneously learning a similarity measure and categorizing images, and hence takes advantage of the complementary capabilities of existing CNN-based approaches. Second, the loss functions coupled with a T-T-C network emphasize the structure of the category tree, using prior semantic knowledge to adaptively adjust the “pull-push” mechanism during training. Finally, a T-T-C network is lightweight, and its features are very compact. We carry out extensive experiments on publicly available datasets and achieve state-of-the-art retrieval performance.
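The abstract does not give the loss functions themselves, so the following is only a minimal sketch of the general idea: a triplet loss whose margin grows with the distance between two classes in a category tree, combined with a standard classification loss. The tree, the class names, the rule "margin = base margin times tree distance", the weighting term, and every function name here are assumptions made for illustration and should not be read as the T-T-C formulation.

```python
import math

# Hypothetical category tree, stored as child -> parent (None marks the root).
PARENT = {
    "root": None,
    "natural": "root",
    "man_made": "root",
    "forest": "natural",
    "river": "natural",
    "airport": "man_made",
    "harbor": "man_made",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tree_distance(a, b):
    """Number of edges on the path between two nodes of the tree."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(a))}
    for j, n in enumerate(path_to_root(b)):
        if n in steps_from_a:  # first common ancestor
            return steps_from_a[n] + j
    raise ValueError("nodes are not in the same tree")


def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def tree_triplet_loss(anchor, positive, negative,
                      anchor_cls, negative_cls, base_margin=0.5):
    """Hinge triplet loss with an assumed tree-dependent margin:
    semantically distant negatives must be pushed further away."""
    margin = base_margin * tree_distance(anchor_cls, negative_cls)
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one sample (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target_index]


def combined_loss(triplet_term, classification_term, weight=1.0):
    """Assumed joint objective: metric term plus weighted classification term."""
    return triplet_term + weight * classification_term


if __name__ == "__main__":
    a, p, n = [0.0, 0.0], [1.0, 0.0], [1.5, 0.0]
    near = tree_triplet_loss(a, p, n, "forest", "river")    # sibling class
    far = tree_triplet_loss(a, p, n, "forest", "airport")   # distant class
    print(near, far)
    print(combined_loss(far, cross_entropy([2.0, 0.5, 0.1], 0)))
```

In this sketch the same geometric configuration incurs no penalty when the negative comes from a sibling class but a positive penalty when it comes from a class in a different branch, which is one simple way prior semantic knowledge could modulate the "pull-push" behaviour.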
