Satellite Image Scene Classification via ConvNet With Context Aggregation

Scene classification is a fundamental problem in understanding high-resolution remote sensing imagery. Recently, convolutional neural networks (ConvNets) have achieved remarkable performance across a range of tasks, and significant effort has gone into developing representations for satellite image scene classification. In this paper, we present a novel representation based on a ConvNet with context aggregation. The proposed two-pathway ResNet (ResNet-TP) architecture adopts ResNet [20] as its backbone, and the two pathways allow the network to model both local details and regional context. The ResNet-TP based representation is generated by global average pooling over the last convolutional layers of both pathways. Experiments on two scene classification datasets, UCM Land Use and NWPU-RESISC45, show that the proposed mechanism achieves promising improvements over state-of-the-art methods.
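
The pooling step described above can be illustrated with a minimal sketch. It assumes that each pathway yields a stack of per-channel feature maps from its last convolutional layer and that the two pooled vectors are simply concatenated; the abstract does not state how the pathways are built or fused, so the function names, the toy feature maps, and the concatenation are illustrative assumptions rather than the authors' implementation.

```python
from typing import List

FeatureMap = List[List[float]]   # one channel: an H x W grid of activations
FeatureMaps = List[FeatureMap]   # C channels from one pathway


def global_average_pool(feature_maps: FeatureMaps) -> List[float]:
    """Reduce each channel to the mean of its activations (C-dim vector)."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def two_pathway_representation(local_maps: FeatureMaps,
                               context_maps: FeatureMaps) -> List[float]:
    """Pool each pathway separately, then concatenate (assumed fusion)."""
    return global_average_pool(local_maps) + global_average_pool(context_maps)


# Hypothetical toy outputs standing in for the last convolutional layers.
local_maps = [
    [[1.0, 3.0], [5.0, 7.0]],   # channel mean: 4.0
    [[0.0, 0.0], [2.0, 2.0]],   # channel mean: 1.0
]
context_maps = [
    [[2.0, 2.0], [2.0, 2.0]],   # channel mean: 2.0
]

representation = two_pathway_representation(local_maps, context_maps)
print(representation)  # [4.0, 1.0, 2.0]
```

Because global average pooling discards the spatial dimensions, the length of the resulting vector depends only on the number of channels in each pathway, not on the input image size.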

[1] Shiming Xiang, et al. Aggregating Rich Hierarchical Features for Scene Classification in Remote Sensing Imagery, 2017, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

[2] Lei Guo, et al. Remote Sensing Image Scene Classification Using Bag of Convolutional Features, 2017, IEEE Geoscience and Remote Sensing Letters.

[3] Paolo Napoletano, et al. Remote Sensing Image Classification Exploiting Multiple Kernel Learning, 2015, IEEE Geoscience and Remote Sensing Letters.

[4] Liangpei Zhang, et al. Pre-Trained AlexNet Architecture with Pyramid Pooling and Supervision for High Spatial Resolution Remote Sensing Image Scene Classification, 2017, Remote Sensing.

[5] Curt H. Davis, et al. Fusion of Deep Convolutional Neural Networks for Land Cover Classification of High-Resolution Imagery, 2017, IEEE Geoscience and Remote Sensing Letters.

[6] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7] Vladlen Koltun, et al. Multi-Scale Context Aggregation by Dilated Convolutions, 2015, ICLR.

[8] Shawn D. Newsam, et al. Bag-of-visual-words and spatial extensions for land-use classification, 2010, GIS '10.

[9] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.

[10] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.

[11] Heng Wang, et al. Dense Dilated Network for Few Shot Action Recognition, 2018, ICMR.

[12] Xiaoqiang Lu, et al. Remote Sensing Image Scene Classification: Benchmark and State of the Art, 2017, Proceedings of the IEEE.

[13] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[14] Gui-Song Xia, et al. AID: A Benchmark Data Set for Performance Evaluation of Aerial Scene Classification, 2016, IEEE Transactions on Geoscience and Remote Sensing.

[15] Trevor Darrell, et al. Caffe: Convolutional Architecture for Fast Feature Embedding, 2014, ACM Multimedia.

[16] Li Wang, et al. Learning Multiviewpoint Context-Aware Representation for RGB-D Scene Classification, 2018, IEEE Signal Processing Letters.

[17] Qingshan Liu, et al. Learning Multiscale Deep Features for High-Resolution Satellite Image Scene Classification, 2018, IEEE Transactions on Geoscience and Remote Sensing.

[18] Gregory D. Hager, et al. Temporal Convolutional Networks for Action Segmentation and Detection, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19] Iasonas Kokkinos, et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21] Dumitru Erhan, et al. Going deeper with convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22] Curt H. Davis, et al. Training Deep Convolutional Neural Networks for Land–Cover Classification of High-Resolution Imagery, 2017, IEEE Geoscience and Remote Sensing Letters.

[23] Chao Huang, et al. Scene Classification via Triplet Networks, 2018, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

[24] Alexander M. Rush, et al. Dilated Convolutions for Modeling Long-Distance Genomic Dependencies, 2017, bioRxiv.

[25] Jian Sun, et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition, 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.