Embedding Differentiable Sparsity into Deep Neural Network

In this paper, we propose embedding sparsity into the structure of deep neural networks so that model parameters can become exactly zero during training with stochastic gradient descent. The network can thus learn its sparsified structure and its weights simultaneously. The proposed approach can learn both structured and unstructured sparsity.
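
The key idea can be illustrated with a differentiable gate whose values reach exact zeros through a ReLU applied to a soft threshold, so that pruning happens inside ordinary gradient-based training. Below is a minimal PyTorch sketch; the `SparseGate` and `SparseLinear` names, the exponential parameterization, and the learned threshold `beta` are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseGate(nn.Module):
    """Hypothetical differentiable gate: a ReLU-based soft threshold
    lets gate values become exactly zero while remaining trainable by SGD."""
    def __init__(self, n):
        super().__init__()
        self.alpha = nn.Parameter(torch.zeros(n))  # free gate parameters
        self.beta = nn.Parameter(torch.zeros(1))   # learned threshold level

    def forward(self):
        gamma = torch.exp(self.alpha)                    # positive magnitudes
        thresh = torch.sigmoid(self.beta) * gamma.sum()  # data-dependent threshold
        gated = F.relu(gamma - thresh)                   # exact zeros below threshold
        return gated / (gated.sum() + 1e-8)              # normalized gates

class SparseLinear(nn.Module):
    """Linear layer whose output units are scaled by differentiable gates;
    a gate of exactly zero prunes the corresponding neuron (structured sparsity)."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        self.gate = SparseGate(d_out)

    def forward(self, x):
        return self.linear(x) * self.gate()
```

In such a scheme, an L1 penalty on the gate values could be added to the training loss to push more gates to zero; because the ReLU produces exact zeros, the pruned structure emerges during optimization rather than through post-hoc thresholding of small weights.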
