Dynamic Thresholding for Learning Sparse Neural Networks

This paper proposes Dynamic Thresholding, a method that dynamically adjusts the size of deep neural networks by removing redundant weights during training. The key idea is to learn the pruning thresholds used for weight removal instead of fixing them manually. We approximate the discontinuous pruning function with a differentiable form involving the thresholds, which can then be optimized by gradient descent. Whereas previous sparsity-promoting methods prune with manually determined thresholds, our method directly obtains a sparse network at each training iteration and therefore needs no trial-and-error process to choose proper threshold values. We evaluate the proposed method on image classification tasks including MNIST, CIFAR-10, and ImageNet. It achieves results competitive with existing methods while requiring fewer training iterations than approaches based on train-prune-retrain cycles.
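
To make the idea concrete, the sketch below (in JAX) shows one way a hard pruning step can be relaxed into a differentiable gate with a learnable threshold. This is only an illustrative approximation under assumed choices (a sigmoid gate, a sharpness constant, and a toy least-squares problem), not the exact formulation used in the paper.

import jax
import jax.numpy as jnp

def soft_prune(weights, threshold, sharpness=50.0):
    # Hard pruning keeps a weight only if |w| exceeds the threshold.
    # Replacing the step function with a sigmoid makes the gate
    # differentiable, so gradients also flow into `threshold`.
    # `sharpness` is an assumed hyperparameter controlling how closely
    # the sigmoid approximates the hard step.
    gate = jax.nn.sigmoid(sharpness * (jnp.abs(weights) - threshold))
    return weights * gate

def toy_loss(params, x, y):
    # Least-squares loss of a single linear layer whose weights are
    # soft-pruned before use.
    w_pruned = soft_prune(params["w"], params["threshold"])
    return jnp.mean((x @ w_pruned - y) ** 2)

# Gradients with respect to both the weights and the pruning threshold,
# so the threshold can be learned jointly with the weights.
grad_fn = jax.grad(toy_loss)

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 10))
w_true = jnp.zeros(10).at[:3].set(1.0)  # only three informative weights
y = x @ w_true

params = {"w": 0.1 * jax.random.normal(key, (10,)),
          "threshold": jnp.array(0.05)}
grads = grad_fn(params, x, y)  # includes d(loss)/d(threshold)

In a full training loop, both the weights and the threshold in params would be updated by the same optimizer step, which is what removes the need to hand-tune the threshold.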
