Scene Classification Based on Two-Stage Deep Feature Fusion

In convolutional neural networks (CNNs), higher-layer information is more abstract and more task-specific, so most work relies on fully connected (FC) layer features, on the assumption that lower-layer features are less discriminative. However, a few studies have shown that the lower layers also provide rich and powerful information for image representation. In view of these findings, in this letter we adaptively and explicitly combine the activations from intermediate and FC layers to generate a new CNN with a directed acyclic graph topology, which we call the converted CNN. Two converted CNNs are then integrated to further improve classification performance. We validate the proposed two-stage deep feature fusion model on two publicly available remote sensing data sets and achieve state-of-the-art performance in scene classification tasks.
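The two-stage idea can be illustrated with a minimal sketch. It is not the letter's implementation: the abstract does not specify the fusion operator, so the sketch assumes global average pooling of each intermediate activation map, L2 normalization, and plain concatenation, with no learned (adaptive) weighting. All function names, layer counts, and tensor sizes below are hypothetical.

```python
import math
import random

def global_average_pool(feature_map):
    """Collapse a channels x height x width nested list to one mean per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit Euclidean length so no layer dominates the fusion."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]

def fuse_single_network(intermediate_maps, fc_features):
    """Stage 1 (assumed): pool and normalize each intermediate map, then
    concatenate the results with the normalized FC features."""
    fused = []
    for fmap in intermediate_maps:
        fused.extend(l2_normalize(global_average_pool(fmap)))
    fused.extend(l2_normalize(fc_features))
    return fused

def fuse_two_networks(fused_a, fused_b):
    """Stage 2 (assumed): concatenate the fused descriptors of two networks."""
    return fused_a + fused_b

def random_map(rng, channels, height, width):
    """Stand-in for a real activation map; values are random placeholders."""
    return [[[rng.random() for _ in range(width)] for _ in range(height)]
            for _ in range(channels)]

rng = random.Random(0)

# Hypothetical activations from two different networks for one image.
net_a_maps = [random_map(rng, 8, 6, 6), random_map(rng, 16, 3, 3)]
net_a_fc = [rng.random() for _ in range(32)]
net_b_maps = [random_map(rng, 4, 6, 6), random_map(rng, 12, 3, 3)]
net_b_fc = [rng.random() for _ in range(24)]

fused_a = fuse_single_network(net_a_maps, net_a_fc)  # 8 + 16 + 32 values
fused_b = fuse_single_network(net_b_maps, net_b_fc)  # 4 + 12 + 24 values
descriptor = fuse_two_networks(fused_a, fused_b)
print(len(descriptor))
```

The resulting descriptor would then be passed to a classifier. In the converted CNN described above, the combination is learned end to end inside a directed acyclic graph network, which this fixed concatenation does not capture.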
