Stabilizing the Lottery Ticket Hypothesis
Jonathan Frankle | Gintare Karolina Dziugaite | Daniel M. Roy | Michael Carbin