AutoPruner: An End-to-End Trainable Filter Pruning Method for Efficient Deep Model Inference

Channel pruning is an important family of methods for speeding up deep model inference. Previous filter pruning algorithms regard channel pruning and model fine-tuning as two independent steps. This paper argues that combining them into a single end-to-end trainable system leads to better results. We propose an efficient channel selection layer, named AutoPruner, that finds less important filters automatically during joint training. AutoPruner takes the activation responses of the previous layer as input and generates a truly binary index code for pruning, so all filters corresponding to zero index values can be removed safely after training. We empirically demonstrate that the gradient information of this channel selection layer also benefits the training of the whole model. By gradually erasing weak filters, we prevent an excessive drop in model accuracy. Compared with previous state-of-the-art pruning algorithms (including training from scratch), AutoPruner achieves significantly better performance. Furthermore, ablation experiments show that the proposed mini-batch pooling and binarization operations are vital to the success of filter pruning.
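
The abstract only sketches the mechanism, so the following is a minimal, hypothetical PyTorch interpretation of such a channel selection layer: activations are pooled over the mini-batch and spatial dimensions, mapped to one score per channel, and squashed toward a binary 0/1 code by a scaled sigmoid, so that zero-coded filters can be dropped after training. The class name, the fully-connected scoring step, and the scale hyperparameter are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn


class ChannelSelectionLayer(nn.Module):
    """Hypothetical sketch of an AutoPruner-style channel selection layer.

    Activations of the previous layer are pooled over the mini-batch and
    spatial dimensions, mapped to one score per channel, and pushed toward
    a binary 0/1 code by a scaled sigmoid. Channels whose code converges
    to 0 can be removed safely after training.
    """

    def __init__(self, num_channels: int, scale: float = 1.0):
        super().__init__()
        # One score per channel; the exact scoring function is an assumption.
        self.fc = nn.Linear(num_channels, num_channels)
        # Sigmoid scale; presumably increased gradually so the soft code
        # approaches {0, 1} by the end of training (assumption).
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) activation responses from the previous layer.
        pooled = x.mean(dim=(0, 2, 3))                       # mini-batch + spatial pooling -> (C,)
        code = torch.sigmoid(self.scale * self.fc(pooled))   # soft binary index code, one value per channel
        # Scale each channel by its (soft) index; zero-coded channels vanish.
        return x * code.view(1, -1, 1, 1)


# Illustrative usage: wrap the output of a convolutional block.
layer = ChannelSelectionLayer(num_channels=256, scale=2.0)
out = layer(torch.randn(8, 256, 14, 14))  # same shape, weak channels suppressed
```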
