Dynamic and Adaptive Threshold for DNN Compression from Scratch

Despite their great success, deep neural networks (DNNs) are hard to deploy on hardware-constrained devices such as mobile phones because of their massive number of parameters. Many methods have been proposed for DNN compression, i.e., reducing the number of parameters in DNN models. However, almost all of them rely on a reference model that must first be fully trained. In this paper, we propose an approach that performs DNN training and compression simultaneously. More concretely, a dynamic and adaptive threshold (DAT) framework prunes a DNN gradually by changing the pruning threshold during training. Experiments show that DAT not only achieves compression rates comparable to or better than those of state-of-the-art DNN compression methods, with almost no loss of accuracy, but also outperforms DNN sparse training methods by a large margin.
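To make the idea of pruning with a changing threshold concrete, the sketch below shows magnitude-based pruning applied inside an ordinary training loop, where the threshold ramps up as training proceeds. This is a minimal illustration, not the authors' DAT method: the linear schedule, the function names (`threshold_schedule`, `apply_dynamic_pruning`), and the choice of final threshold are illustrative assumptions; the paper's actual threshold is dynamic and adaptive rather than a fixed ramp.

```python
# Minimal sketch (assumed, not the paper's code): prune weights whose magnitude
# falls below a threshold that grows over the course of training.
import torch
import torch.nn as nn

def threshold_schedule(step, total_steps, final_threshold):
    """Illustrative schedule: the magnitude threshold ramps up linearly to its final value."""
    return final_threshold * min(step / max(total_steps, 1), 1.0)

def apply_dynamic_pruning(model, step, total_steps, final_threshold=1e-2):
    """Zero out weights whose magnitude is below the current threshold."""
    t = threshold_schedule(step, total_steps, final_threshold)
    with torch.no_grad():
        for p in model.parameters():
            if p.dim() > 1:                       # prune weight matrices/filters, not biases
                mask = (p.abs() >= t).to(p.dtype) # 1 for kept weights, 0 for pruned ones
                p.mul_(mask)                      # pruned weights are set to zero in place

# Usage inside a standard training loop (sketch):
# for step, (x, y) in enumerate(loader):
#     loss = criterion(model(x), y)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()
#     apply_dynamic_pruning(model, step, total_steps)
```

Because pruning is applied during training rather than to a fully trained reference model, the network can recover from the gradually increasing sparsity as the remaining weights continue to be updated.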
