A Unified Approximation Framework for Deep Neural Networks

Deep neural networks (DNNs) have achieved remarkable success in a wide range of real-world applications. However, the huge number of parameters in these networks limits their efficiency because of the large model size and the intensive computation. To address this issue, various compression and acceleration techniques have been investigated, among which low-rank filters and sparse filters are the most heavily studied. In this paper, we propose a unified framework for compressing convolutional neural networks that combines these two strategies while taking the nonlinear activation into consideration. The filter of each layer is approximated by the sum of a sparse component and a low-rank component, both of which favor model compression. In particular, we constrain the sparse component to be structured sparse, which facilitates acceleration. The performance of the network is retained by minimizing the reconstruction error of the feature maps after the activation of each layer, using the alternating direction method of multipliers (ADMM). Experimental results show that the proposed approach compresses VGG-16 and AlexNet by more than 4×. In addition, speedups of 2.2× and 1.1× are achieved on VGG-16 and AlexNet, respectively, with only a small increase in error rate.

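To make the decomposition concrete, the sketch below illustrates a basic low-rank-plus-sparse splitting of a flattened filter matrix, alternating truncated SVD for the low-rank part and magnitude pruning for the sparse part. It is a minimal sketch, not the paper's algorithm: it omits the structured-sparsity constraint, the post-activation feature-map reconstruction objective, and the ADMM solver, and the rank, sparsity level, and iteration count are illustrative assumptions.

    # Minimal sketch (assumed settings, not the authors' exact method):
    # approximate a flattened conv filter W as L + S, where L is low rank
    # and S is sparse, by alternating SVD truncation and magnitude pruning.
    import numpy as np

    def low_rank_plus_sparse(W, rank=8, sparsity=0.9, n_iter=50):
        """Approximate W ~= L + S with rank(L) <= rank and S mostly zero."""
        L = np.zeros_like(W)
        S = np.zeros_like(W)
        for _ in range(n_iter):
            # Low-rank step: best rank-r approximation of the residual W - S.
            U, sigma, Vt = np.linalg.svd(W - S, full_matrices=False)
            L = (U[:, :rank] * sigma[:rank]) @ Vt[:rank, :]
            # Sparse step: keep only the largest-magnitude entries of W - L.
            R = W - L
            threshold = np.quantile(np.abs(R), sparsity)
            S = np.where(np.abs(R) > threshold, R, 0.0)
        return L, S

    # Example: a hypothetical 3x3 conv layer with 128 input and 256 output
    # channels, flattened to a 256 x 1152 matrix.
    W = np.random.randn(256, 1152).astype(np.float32)
    L, S = low_rank_plus_sparse(W, rank=16, sparsity=0.95)
    rel_err = np.linalg.norm(W - (L + S)) / np.linalg.norm(W)
    print(f"relative reconstruction error: {rel_err:.3f}")

In the paper's setting, the two components are instead fitted so that the layer's output feature maps after the nonlinear activation are reconstructed, and ADMM handles the coupled low-rank and structured-sparsity constraints.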