Fuzzy Pruning for Compression of Convolutional Neural Networks

Pruning can be applied to convolutional neural networks (CNNs) to reduce their computational cost. In each pruning iteration, a fixed ratio of filters or weights is removed after their importance has been evaluated against a chosen criterion, which risks mis-pruning when the distribution of importance scores is imbalanced. We propose a strategy to assist filter-level pruning of CNNs. First, we design two fuzzy membership functions to evaluate the importance and the non-importance of the filters in the network; we then score every filter by its membership values; finally, an α-cut set is used to determine the filters to be pruned. Experimental results demonstrate the effectiveness of our strategy: at a similar accuracy level it achieves greater compression than popular pruning algorithms.
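
The sketch below illustrates the overall idea of membership-based filter selection with an α-cut. It is only a minimal illustration under stated assumptions: the abstract does not specify the importance criterion, the form of the two membership functions, or the value of α, so the L1-norm criterion, the linear membership functions, the complement-based non-importance, and α = 0.7 used here are all hypothetical choices, not the paper's actual definitions.

```python
import numpy as np

def l1_importance(filters):
    # Importance score of each filter, here taken as its L1 norm
    # (an assumed criterion; the paper's criterion may differ).
    return np.abs(filters).reshape(filters.shape[0], -1).sum(axis=1)

def importance_membership(scores):
    # Fuzzy membership in the "important" set: scores rescaled to [0, 1].
    # A simple linear membership function, used only for illustration.
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo + 1e-12)

def non_importance_membership(scores):
    # Membership in the "unimportant" set, taken here as the fuzzy complement.
    return 1.0 - importance_membership(scores)

def filters_to_prune(filters, alpha=0.7):
    # Alpha-cut on the non-importance membership: prune every filter whose
    # membership in the "unimportant" set is at least alpha.
    mu_unimportant = non_importance_membership(l1_importance(filters))
    return np.where(mu_unimportant >= alpha)[0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    conv_filters = rng.normal(size=(64, 3, 3, 3))  # e.g. 64 filters of one conv layer
    pruned = filters_to_prune(conv_filters, alpha=0.7)
    print(f"pruning {len(pruned)} of {conv_filters.shape[0]} filters: {pruned}")
```

Unlike pruning a fixed ratio of filters per iteration, the number of filters removed by the α-cut adapts to how the membership values are distributed in each layer.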
