Neural Network Compression and Acceleration by Federated Pruning

In recent years, channel pruning has become one of the most important methods for deep model compression. However, the pruned models often still contain a large number of redundant feature maps. In this paper, we propose a novel method, the federated pruning algorithm, to obtain narrower models with negligible performance degradation. Different from many existing approaches, the federated pruning algorithm removes filters in the pre-trained model, together with their corresponding feature maps, by combining the filter weights with the importance of the channels, rather than pruning the network according to a single criterion. Finally, we fine-tune the resulting model to restore network performance. Extensive experiments demonstrate the effectiveness of the federated pruning algorithm. A VGG-19 network pruned by the federated pruning algorithm on CIFAR-10 achieves a 92.5% reduction in total parameters and a \(13.58\times \) compression ratio with only a 0.23% decrease in accuracy. Meanwhile, tested on SVHN, VGG-19 achieves a 94.5% reduction in total parameters and an \(18.01\times \) compression ratio with only a 0.43% decrease in accuracy.
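
The abstract describes scoring channels by combining filter weights with channel importance rather than relying on a single pruning criterion, but it does not spell out the exact combination. The sketch below is therefore only a minimal illustration under assumed choices: the per-channel score blends the L1 norm of each filter's weights with the BatchNorm scaling factor (a common channel-importance proxy), and the lowest-scoring channels are marked for removal before fine-tuning. The function names and the blending weight `alpha` are hypothetical, not taken from the paper.

```python
# Minimal sketch of channel pruning with a combined importance score.
# Assumptions (not specified in the abstract): the per-channel score is a
# weighted blend of the filter's L1 weight norm and the BatchNorm scale
# factor, with both terms normalized before being combined.
import torch
import torch.nn as nn


def channel_scores(conv: nn.Conv2d, bn: nn.BatchNorm2d, alpha: float = 0.5) -> torch.Tensor:
    """Combine filter weight magnitude with a channel-importance proxy (BN gamma)."""
    # L1 norm of each output filter's weights: shape (out_channels,)
    weight_norm = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    # BatchNorm scaling factor as channel importance: shape (out_channels,)
    importance = bn.weight.detach().abs()
    # Normalize both terms so neither dominates, then blend with alpha.
    weight_norm = weight_norm / (weight_norm.max() + 1e-12)
    importance = importance / (importance.max() + 1e-12)
    return alpha * weight_norm + (1.0 - alpha) * importance


def channels_to_prune(conv: nn.Conv2d, bn: nn.BatchNorm2d, prune_ratio: float = 0.5) -> torch.Tensor:
    """Return indices of the lowest-scoring channels to remove (then fine-tune)."""
    scores = channel_scores(conv, bn)
    num_prune = int(prune_ratio * scores.numel())
    return torch.argsort(scores)[:num_prune]


# Example usage on a single conv/BN pair from a VGG-style network.
conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)
bn = nn.BatchNorm2d(128)
print(channels_to_prune(conv, bn, prune_ratio=0.5))
```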
