Learning Low Dimensional Convolutional Neural Networks for High-Resolution Remote Sensing Image Retrieval

Learning powerful feature representations for image retrieval has always been a challenging task in the field of remote sensing. Traditional methods rely on extracting low-level hand-crafted features, a process that is not only time-consuming but also tends to yield unsatisfactory performance because of the complexity of remote sensing images. In this paper, we investigate how to extract deep feature representations based on convolutional neural networks (CNNs) for high-resolution remote sensing image retrieval (HRRSIR). To this end, several effective schemes are proposed to generate powerful feature representations for HRRSIR. In the first scheme, a CNN pre-trained on a different problem is used as a feature extractor, since no remote sensing dataset is large enough to train a CNN from scratch. In the second scheme, we investigate learning features that are specific to our problem by first fine-tuning the pre-trained CNN on a remote sensing dataset and then proposing a novel CNN architecture based on convolutional layers and a three-layer perceptron. The novel CNN has fewer parameters than the pre-trained and fine-tuned CNNs and can learn low-dimensional features from a limited number of labelled images. The schemes are evaluated on several challenging, publicly available datasets. The results indicate that the proposed schemes, particularly the novel CNN, achieve state-of-the-art performance.
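
As a rough illustration of how low-dimensional features can be used for retrieval, the sketch below ranks database images by their distance to a query feature. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the abstract does not specify a similarity measure, so Euclidean distance on L2-normalized vectors is assumed here, and the function names (`l2_normalize`, `retrieve`), the image names, and the 4-dimensional feature values are hypothetical stand-ins for real CNN outputs.

```python
import math
from typing import Dict, List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a feature vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def retrieve(query: Sequence[float],
             database: Dict[str, Sequence[float]],
             top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the top_k database images closest to the query feature.

    Assumption: similarity is measured by Euclidean distance between
    L2-normalized feature vectors (smaller distance = more similar).
    """
    q = l2_normalize(query)
    scored = [(name, math.dist(q, l2_normalize(feat)))
              for name, feat in database.items()]
    scored.sort(key=lambda item: item[1])  # ascending distance
    return scored[:top_k]


# Hypothetical 4-dimensional features standing in for CNN outputs.
database = {
    "harbor_01": [0.9, 0.1, 0.0, 0.2],
    "harbor_02": [0.8, 0.2, 0.1, 0.1],
    "forest_01": [0.0, 0.9, 0.7, 0.1],
}

results = retrieve([1.0, 0.1, 0.0, 0.1], database, top_k=2)
for name, distance in results:
    print(f"{name}: {distance:.3f}")
```

With features of this kind, the cost of each comparison grows linearly with the feature dimension, which is one practical reason a low-dimensional representation is attractive for searching large image archives.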
