ABCP: Automatic Blockwise and Channelwise Network Pruning via Joint Search

Currently, an increasing number of model pruning methods are being proposed to resolve the contradiction between the computing power required by deep learning models and the limited resources of embedded devices. However, most traditional rule-based network pruning methods cannot reach a sufficient compression ratio with low accuracy loss, and they are time-consuming as well as laborious. In this paper, we propose Automatic Blockwise and Channelwise Network Pruning (ABCP), which jointly searches the blockwise and channelwise pruning actions with deep reinforcement learning. A joint sampling algorithm is proposed to simultaneously generate the pruning choice of each residual block and the channel pruning ratio of each convolutional layer, drawn from the discrete and continuous search spaces, respectively. Finally, the best pruning action, taking both the accuracy and the complexity of the model into account, is obtained. Compared with traditional rule-based pruning methods, this pipeline saves human labor and achieves a higher compression ratio with lower accuracy loss. Tested on a mobile robot detection dataset, the pruned YOLOv3 model saves 99.5% of FLOPs, reduces parameters by 99.5%, and achieves a 37.3× speedup with only a 2.8% mAP loss. The results of a transfer task on a sim2real detection dataset also show that our pruned model is much more robust.
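To make the joint sampling idea concrete, the sketch below shows one plausible realization: an LSTM controller that emits a discrete keep/prune decision for each residual block and a continuous pruning ratio for each convolutional layer, trained with REINFORCE against a reward that trades accuracy off against FLOPs. This is a minimal sketch under our own assumptions, not the authors' released code; the class name `JointPruningController`, the Beta-distribution ratio head, and the reward shape are all hypothetical.

```python
# Hypothetical sketch of a joint sampling controller (not the authors' code).
# An LSTM emits a discrete keep/prune action per residual block (discrete
# search space) and a Beta-distributed channel pruning ratio per conv layer
# (continuous search space), trained with REINFORCE.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Categorical, Beta

class JointPruningController(nn.Module):
    def __init__(self, num_blocks, num_layers, hidden=64):
        super().__init__()
        self.cell = nn.LSTMCell(hidden, hidden)
        self.start = nn.Parameter(torch.zeros(1, hidden))  # learned start token
        self.block_head = nn.Linear(hidden, 2)  # logits: keep / prune the block
        self.ratio_head = nn.Linear(hidden, 2)  # Beta(alpha, beta) parameters
        self.num_blocks, self.num_layers = num_blocks, num_layers

    def sample(self):
        h = c = torch.zeros(1, self.cell.hidden_size)
        x = self.start
        log_probs, block_actions, ratios = [], [], []
        for _ in range(self.num_blocks):          # discrete pruning choices
            h, c = self.cell(x, (h, c))
            dist = Categorical(logits=self.block_head(h))
            a = dist.sample()
            log_probs.append(dist.log_prob(a))
            block_actions.append(a.item())
            x = h
        for _ in range(self.num_layers):          # continuous pruning ratios
            h, c = self.cell(x, (h, c))
            ab = F.softplus(self.ratio_head(h)) + 1.0  # keep params > 1
            dist = Beta(ab[:, 0], ab[:, 1])
            r = dist.sample()                     # ratio in (0, 1)
            log_probs.append(dist.log_prob(r))
            ratios.append(r.item())
            x = h
        return block_actions, ratios, torch.stack(log_probs).sum()

def reinforce_step(controller, optimizer, reward, log_prob, baseline):
    # Policy-gradient update; reward could be e.g. mAP - lambda * FLOPs.
    loss = -(reward - baseline) * log_prob
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The Beta distribution is one convenient choice here because its support is exactly (0, 1), so sampled pruning ratios never need clipping; for a YOLOv3-style backbone one might instantiate, say, `JointPruningController(num_blocks=23, num_layers=75)`, though the actual action-space dimensions depend on the paper's search-space definition.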
