A Deep Scene Representation for Aerial Scene Classification

As a fundamental problem in earth observation, aerial scene classification aims to assign a specific semantic label to an aerial image. In recent years, deep convolutional neural networks (CNNs) have achieved strong performance in aerial scene classification, and successful pretrained CNNs can be transferred to aerial images. However, global CNN activations may lack geometric invariance, which limits further improvement in aerial scene classification. To address this problem, this paper proposes a deep scene representation that makes CNN features more invariant and further enhances their discriminative power. The proposed method: 1) extracts CNN activations from the last convolutional layer of a pretrained CNN; 2) performs multiscale pooling (MSP) on these activations; and 3) builds a holistic representation with the Fisher vector method. MSP is a simple and effective multiscale strategy that enriches multiscale spatial information at an affordable computational cost. The proposed representation is particularly suited to aerial scenes and consistently outperforms global CNN activations without requiring feature adaptation. Extensive experiments on five aerial scene data sets indicate that the proposed method, even with a simple linear classifier, can achieve state-of-the-art performance.
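The three steps above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not define MSP in detail, so the stride-1 average pooling, the window sizes, and the function names `multiscale_pool` and `fisher_vector` are assumptions made for illustration. The convolutional feature map is replaced by random values, and the diagonal-covariance GMM parameters are also random rather than fitted. The encoding follows the standard improved Fisher vector (gradients with respect to means and variances, then power and L2 normalization).

```python
import numpy as np


def multiscale_pool(fmap, window_sizes=(1, 2, 3)):
    """Average-pool an (H, W, C) feature map over stride-1 square windows.

    Assumed stand-in for MSP: every window of every size yields one
    C-dimensional local descriptor. Returns an (N, C) array.
    """
    H, W, _ = fmap.shape
    descriptors = []
    for s in window_sizes:
        for i in range(H - s + 1):
            for j in range(W - s + 1):
                descriptors.append(fmap[i:i + s, j:j + s].mean(axis=(0, 1)))
    return np.stack(descriptors)


def fisher_vector(X, weights, means, variances):
    """Improved Fisher vector of descriptors X (T, D) under a diagonal GMM.

    weights: (K,), means: (K, D), variances: (K, D).
    Returns a vector of length 2 * K * D.
    """
    T = X.shape[0]
    diff = X[:, None, :] - means[None, :, :]            # (T, K, D)

    # Posterior responsibilities gamma[t, k], computed in the log domain.
    log_prob = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )
    log_post = np.log(weights) + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)
    gamma = np.exp(log_post)
    gamma /= gamma.sum(axis=1, keepdims=True)           # (T, K)

    # Gradients with respect to the means and the variances.
    sigma = np.sqrt(variances)
    g_mu = (gamma[:, :, None] * diff / sigma).sum(axis=0)
    g_mu /= T * np.sqrt(weights)[:, None]
    g_var = (gamma[:, :, None] * (diff ** 2 / variances - 1.0)).sum(axis=0)
    g_var /= T * np.sqrt(2.0 * weights)[:, None]

    fv = np.concatenate([g_mu.ravel(), g_var.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))              # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)            # L2 normalization


# Toy example: a random 7x7x16 array stands in for last-conv-layer activations.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((7, 7, 16))
local_descriptors = multiscale_pool(feature_map)

# Hypothetical GMM with K = 4 components (random, not fitted to data).
K, D = 4, local_descriptors.shape[1]
gmm_weights = np.full(K, 1.0 / K)
gmm_means = rng.standard_normal((K, D))
gmm_variances = np.ones((K, D))

scene_vector = fisher_vector(local_descriptors, gmm_weights, gmm_means, gmm_variances)
```

In practice the GMM would be fitted on descriptors pooled from training images, and `scene_vector` would be passed to a linear classifier, as the abstract describes.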
