Combining Multilevel Features for Remote Sensing Image Scene Classification With Attention Model

Remote sensing (RS) image scene classification is challenging because of high intraclass variety and interclass similarity. Recently, many convolutional neural network (CNN)-based methods have been applied to this task. However, RS images usually contain confusing background in addition to the relevant objects, and features derived only from the whole image cannot achieve satisfactory results. To solve this problem, this letter proposes a method that uses an attention network to localize multiscale discriminative regions of RS scene images and combines the features learned from the localized regions by a classification network. Specifically, the classification network is composed of three subnetworks, each trained separately on regions of a particular scale. To learn more discriminative feature representations, a feature fusion module is introduced to fuse the features of the three subnetworks more effectively. Experiments conducted on the AID and NWPU-RESISC45 data sets demonstrate the effectiveness of the proposed method.
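
The data flow described above (localize a discriminative region, crop it at three scales, extract features per scale, fuse) can be illustrated with a minimal pure-Python sketch. It is not the letter's implementation. Every name here (`peak_location`, `crop_around`, `toy_features`, `fuse`, `multiscale_descriptor`) is hypothetical. The attention map is assumed to be given, the `[mean, max]` statistics stand in for the three trained CNN subnetworks, and plain concatenation stands in for the feature fusion module, whose actual design the abstract does not specify.

```python
from typing import List, Sequence, Tuple

Grid = List[List[float]]


def peak_location(attention: Grid) -> Tuple[int, int]:
    """Return (row, col) of the largest value in the attention map."""
    best = (0, 0)
    for r, row in enumerate(attention):
        for c, value in enumerate(row):
            if value > attention[best[0]][best[1]]:
                best = (r, c)
    return best


def crop_around(image: Grid, center: Tuple[int, int], size: int) -> Grid:
    """Crop a size x size window near `center`, shifted to stay inside the image."""
    height, width = len(image), len(image[0])
    size = min(size, height, width)
    top = min(max(center[0] - size // 2, 0), height - size)
    left = min(max(center[1] - size // 2, 0), width - size)
    return [row[left:left + size] for row in image[top:top + size]]


def toy_features(region: Grid) -> List[float]:
    """Assumed stand-in for one subnetwork: [mean, max] of the region."""
    flat = [value for row in region for value in row]
    return [sum(flat) / len(flat), max(flat)]


def fuse(features: Sequence[List[float]]) -> List[float]:
    """Assumed stand-in for the fusion module: plain concatenation."""
    return [value for feature in features for value in feature]


def multiscale_descriptor(
    image: Grid, attention: Grid, sizes: Sequence[int] = (2, 4, 6)
) -> List[float]:
    """Crop three scales around the attention peak and fuse their features."""
    center = peak_location(attention)
    return fuse([toy_features(crop_around(image, center, s)) for s in sizes])
```

In this sketch the largest scale can cover the whole image, so the fused descriptor mixes local evidence around the attended region with global context; the scales and fusion rule in the letter itself may differ.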
