Robust Space–Frequency Joint Representation for Remote Sensing Image Scene Classification

Remote sensing image scene classification is a fundamental problem that aims to automatically label an image with a specific semantic category. Recent progress on this task has been substantial, owing mostly to the powerful feature extraction capability of convolutional neural networks (CNNs). Although these CNN-based methods achieve competitive performance, they construct the image representation only in the location-sensitive space domain. As a result, their representations are not robust to rotation variations in remote sensing images, which degrades classification accuracy. In this paper, we propose a novel feature representation method that adds a frequency-domain branch to the traditional space-domain-only architecture. Our framework takes full advantage of discriminative features from the space domain and location-robust features from the frequency domain, and it provides more advanced representations through an additional joint learning module, a property that is critically needed for remote sensing image scene classification. Additionally, our method achieves satisfactory performance on four public and challenging remote sensing image scene data sets: Sydney, UC-Merced, WHU-RS19, and AID.
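The claim that frequency-domain features are location-robust can be illustrated with a small numerical check. The sketch below is not the paper's method: the abstract does not say which transform the frequency-domain branch uses, so a plain 2-D discrete Fourier transform is assumed here, and the helper name `fft_magnitude` is hypothetical. It shows only that the Fourier magnitude spectrum is unchanged when image content is circularly shifted, while the pixel values themselves change. It does not demonstrate robustness to rotation.

```python
import numpy as np


def fft_magnitude(image):
    """Return the magnitude of the 2-D discrete Fourier transform of an array."""
    return np.abs(np.fft.fft2(image))


# A random array stands in for a single-channel image (an assumption for illustration).
rng = np.random.default_rng(0)
image = rng.random((32, 32))

# Circularly shift the content by 5 rows down and 3 columns left.
shifted = np.roll(image, shift=(5, -3), axis=(0, 1))

# In the space domain the two arrays differ pixel by pixel.
pixel_diff = np.max(np.abs(image - shifted))

# In the frequency domain a circular shift only changes the phase,
# so the magnitude spectra agree up to floating-point error.
spectrum_diff = np.max(np.abs(fft_magnitude(image) - fft_magnitude(shifted)))

print("max pixel difference:", pixel_diff)
print("max magnitude-spectrum difference:", spectrum_diff)
```

The shift here is circular, so the invariance is exact; for a real image translated with content entering and leaving the frame, the magnitude spectrum would be only approximately preserved.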
