Structured Probabilistic Pruning for Convolutional Neural Network Acceleration

Although deep Convolutional Neural Networks (CNNs) have achieved strong performance on various computer vision tasks, their deployment is restricted by substantial storage and computation costs. Among CNN simplification techniques, parameter pruning is a promising approach that reduces the number of weights in various layers without severely degrading the original accuracy. In this paper, we propose a novel progressive parameter pruning method, named Structured Probabilistic Pruning (SPP), which prunes weights of convolutional layers in a probabilistic manner. Unlike existing deterministic pruning approaches, where unimportant weights are permanently eliminated, SPP assigns a pruning probability to each weight, and pruning is guided by sampling from these probabilities. A mechanism increases or decreases the pruning probabilities during training according to importance criteria. Experiments show that, at 4x speedup, SPP accelerates AlexNet with only a 0.3% loss of top-5 accuracy and VGG-16 with a 0.8% loss of top-5 accuracy on ImageNet classification. Moreover, SPP can be directly applied to multi-branch networks such as ResNet without specific adaptations; our 2x-speedup ResNet-50 suffers only a 0.8% loss of top-5 accuracy on ImageNet. We further demonstrate the effectiveness of our method on a transfer learning task with AlexNet on the Flower-102 dataset.
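
Below is a minimal sketch of the probabilistic pruning idea described above, assuming a column-wise (structured) weight grouping, an L1-magnitude importance criterion, and a fixed probability step size; these specific choices, function names, and the toy weight shape are illustrative assumptions rather than the paper's exact formulation.

```python
# Sketch of probabilistic pruning for one convolutional layer's weights,
# flattened to a 2-D matrix (filters x unrolled patch dimensions).
import numpy as np

rng = np.random.default_rng(0)

def update_pruning_probs(weights, probs, target_ratio, delta=0.05):
    """Raise pruning probability for low-importance weight groups (columns)
    and lower it for high-importance groups, clipping to [0, 1]."""
    importance = np.abs(weights).sum(axis=0)        # L1 norm per column group
    n_prune = int(target_ratio * importance.size)   # groups intended to be removed
    order = np.argsort(importance)                  # least important first
    probs = probs.copy()
    probs[order[:n_prune]] = np.clip(probs[order[:n_prune]] + delta, 0.0, 1.0)
    probs[order[n_prune:]] = np.clip(probs[order[n_prune:]] - delta, 0.0, 1.0)
    return probs

def sample_mask(probs, shape):
    """Sample a binary mask: each group is zeroed out with its pruning probability."""
    keep_cols = rng.random(probs.size) >= probs
    return np.broadcast_to(keep_cols, shape).astype(np.float32)

# Toy example: 16 filters over 3x3x3 patches (16 x 27 weight matrix).
W = rng.normal(size=(16, 27)).astype(np.float32)
probs = np.zeros(W.shape[1], dtype=np.float32)

for step in range(20):                              # stands in for training iterations
    probs = update_pruning_probs(W, probs, target_ratio=0.5)
    mask = sample_mask(probs, W.shape)
    W_pruned = W * mask                             # masked weights used in the forward pass

print("converged pruning probabilities:", np.round(probs, 2))
```

Because the mask is resampled rather than fixed, a group whose importance recovers during training can regain a low pruning probability, which is the key difference from permanently eliminating weights in deterministic pruning.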
