Pruning Convolutional Neural Networks with an Attention Mechanism for Remote Sensing Image Classification

Despite the great success of Convolutional Neural Networks (CNNs) in various visual recognition tasks, the high computational and storage costs of such deep networks impede their deployment in real-time remote sensing tasks. For this reason, considerable attention has been given to filter pruning techniques, which slim deep networks with an acceptable performance drop and thus make it feasible to run them on remote sensing devices. In this paper, we propose a new scheme, termed Pruning Filter with Attention Mechanism (PFAM), to compress and accelerate traditional CNNs. In particular, a novel correlation-based filter pruning criterion, which explores the long-range dependencies among filters via an attention module, is employed to select the to-be-pruned filters. Distinct from previous methods, the less correlated filters are first pruned in the current training epoch, and are then reconstructed and updated during the next training epoch. Doing so lets the network process input data with the maximum information preserved while following the original training strategy, so that the compressed model can be obtained without a pretrained model. The proposed method is evaluated on three public remote sensing image datasets, and the experimental results demonstrate its superiority over state-of-the-art baselines. Specifically, PFAM achieves a 0.67% accuracy improvement with a 40% model-size reduction on the Aerial Image Dataset (AID).
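
The pruning criterion described above can be illustrated with a minimal NumPy sketch. It is not the PFAM implementation: the scaled dot-product attention over flattened filters, the column-sum score, and the names `attention_correlation_scores`, `soft_prune`, and `prune_ratio` are all assumptions made for illustration. The sketch only shows the general idea of scoring each filter by how strongly the other filters attend to it, zeroing the least correlated ones, and leaving them in the layer so that a later training epoch can update them again.

```python
import numpy as np


def attention_correlation_scores(filters):
    """Score each filter by the attention it receives from the other filters.

    filters: array of shape (n_filters, c_in, k, k), with n_filters >= 2.
    Hypothetical criterion: scaled dot-product attention between flattened
    filters, with self-attention masked out.
    """
    n = filters.shape[0]
    flat = filters.reshape(n, -1)
    logits = flat @ flat.T / np.sqrt(flat.shape[1])
    np.fill_diagonal(logits, -np.inf)  # a filter should not vote for itself
    logits -= logits.max(axis=1, keepdims=True)  # numerically stable softmax
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to 1
    return weights.sum(axis=0)  # total attention received by each filter


def soft_prune(filters, prune_ratio):
    """Zero the least correlated filters but keep their slots in the layer.

    Because the layer shape is unchanged, the zeroed filters can still
    receive gradient updates in the next training epoch.
    """
    scores = attention_correlation_scores(filters)
    n_prune = int(filters.shape[0] * prune_ratio)
    pruned_idx = np.argsort(scores)[:n_prune]  # lowest scores first
    pruned = filters.copy()
    pruned[pruned_idx] = 0.0
    return pruned, pruned_idx


# Toy usage: 8 random 3x3 filters over 3 input channels (assumed sizes).
rng = np.random.default_rng(0)
filters = rng.normal(size=(8, 3, 3, 3))
pruned, pruned_idx = soft_prune(filters, prune_ratio=0.4)
print("zeroed filters:", sorted(pruned_idx.tolist()))
```

In a training loop, a step like `soft_prune` would be applied to each convolutional layer at the end of an epoch, and the next epoch would train all filters, including the zeroed ones. How the paper's attention module is built and trained is not specified in this abstract, so the scoring function here is only a stand-in.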
