Auto-Balanced Filter Pruning for Efficient Convolutional Neural Networks

In recent years, considerable research effort has been devoted to compression techniques for convolutional neural networks (CNNs). Much of this work has focused on connection pruning methods, which produce sparse parameter tensors in convolutional or fully connected layers. Several studies have demonstrated that even simple methods can effectively eliminate connections in a CNN. However, because these methods make parameter tensors sparser but not smaller, the compression may not translate directly into acceleration without support from specially designed hardware. In this paper, we propose an iterative approach named Auto-balanced Filter Pruning, in which the network is pre-trained in an auto-balanced way that transfers the representational capacity of its convolutional layers to a fraction of the filters; the redundant filters are then pruned, and the network is re-trained to restore accuracy. In this way, a smaller version of the original network is learned and the number of floating-point operations (FLOPs) is reduced. By applying this method to several common CNNs, we show that a large portion of the filters can be discarded without a noticeable drop in accuracy, leading to a significant reduction in computational cost. Concretely, we reduce the inference cost of LeNet-5 on MNIST, and of VGG-16 and ResNet-56 on CIFAR-10, by 95.1%, 79.7%, and 60.9%, respectively.
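
As a rough illustration of the prune-then-retrain loop described above, the sketch below removes whole filters from one convolutional layer and the matching input channels of the layer that follows it, so the resulting tensors are genuinely smaller rather than merely sparse. The abstract does not specify the filter-importance criterion or the auto-balanced pre-training scheme, so ranking filters by their L1 norm and the `keep_ratio` parameter are assumptions made for illustration only; the pre-training and re-training stages are omitted.

```python
import numpy as np

def filter_l1_norms(conv_weights):
    # Per-filter L1 norms for a conv weight tensor of shape
    # (out_channels, in_channels, kh, kw).
    return np.abs(conv_weights).reshape(conv_weights.shape[0], -1).sum(axis=1)

def prune_filters(conv_weights, next_weights, keep_ratio):
    # Keep the strongest `keep_ratio` fraction of filters (by L1 norm)
    # and drop the corresponding input channels of the following layer,
    # shrinking both weight tensors along the pruned dimension.
    norms = filter_l1_norms(conv_weights)
    n_keep = max(1, int(round(keep_ratio * len(norms))))
    keep_idx = np.sort(np.argsort(norms)[::-1][:n_keep])  # surviving filter indices
    pruned = conv_weights[keep_idx]        # fewer output channels
    next_pruned = next_weights[:, keep_idx]  # fewer input channels downstream
    return pruned, next_pruned, keep_idx

# Toy example: a 64-filter layer followed by a 128-filter layer.
w1 = np.random.randn(64, 3, 3, 3)
w2 = np.random.randn(128, 64, 3, 3)
w1_p, w2_p, kept = prune_filters(w1, w2, keep_ratio=0.25)
print(w1_p.shape, w2_p.shape)  # (16, 3, 3, 3) and (128, 16, 3, 3)
```

Because the surviving tensors are dense and simply smaller, the FLOP reduction is realized on ordinary hardware; in an iterative scheme, the pruned network would be re-trained before the next pruning round.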
