暂无分享,去创建一个
[1] Nikos Komodakis,et al. Wide Residual Networks , 2016, BMVC.
[2] Xin Wang,et al. Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization , 2019, ICML.
[3] Niraj K. Jha,et al. Grow and Prune Compact, Fast, and Accurate LSTMs , 2018, IEEE Transactions on Computers.
[4] Suyog Gupta,et al. To prune, or not to prune: exploring the efficacy of pruning for model compression , 2017, ICLR.
[5] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[6] Erich Elsen,et al. The State of Sparsity in Deep Neural Networks , 2019, ArXiv.
[7] Ehud D. Karnin,et al. A simple procedure for pruning back-propagation trained neural networks , 1990, IEEE Trans. Neural Networks.
[8] Masumi Ishikawa,et al. Structural learning with forgetting , 1996, Neural Networks.
[9] Dmitry P. Vetrov,et al. Variational Dropout Sparsifies Deep Neural Networks , 2017, ICML.
[10] Ilya Sutskever,et al. Generating Long Sequences with Sparse Transformers , 2019, ArXiv.
[11] Max Welling,et al. Bayesian Compression for Deep Learning , 2017, NIPS.
[12] Max Welling,et al. Soft Weight-Sharing for Neural Network Compression , 2017, ICLR.
[13] Xin Dong,et al. Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon , 2017, NIPS.
[14] Yann LeCun,et al. Optimal Brain Damage , 1989, NIPS.
[15] Jason Yosinski,et al. Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask , 2019, NeurIPS.
[16] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .
[17] Diederik P. Kingma,et al. GPU Kernels for Block-Sparse Weights , 2017 .
[18] Michael C. Mozer,et al. Skeletonization: A Technique for Trimming the Fat from a Network via Relevance Assessment , 1988, NIPS.
[19] Ning Qian,et al. On the momentum term in gradient descent learning algorithms , 1999, Neural Networks.
[20] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[21] Andrew Zisserman,et al. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps , 2013, ICLR.
[22] Niraj K. Jha,et al. NeST: A Neural Network Synthesis Tool Based on a Grow-and-Prune Paradigm , 2017, IEEE Transactions on Computers.
[23] Song Han,et al. Learning both Weights and Connections for Efficient Neural Network , 2015, NIPS.
[24] Peter Stone,et al. Scalable training of artificial neural networks with adaptive sparse connectivity inspired by network science , 2017, Nature Communications.
[25] Philip H. S. Torr,et al. SNIP: Single-shot Network Pruning based on Connection Sensitivity , 2018, ICLR.
[26] Erich Elsen,et al. Exploring Sparsity in Recurrent Neural Networks , 2017, ICLR.
[27] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[29] Yurong Chen,et al. Dynamic Network Surgery for Efficient DNNs , 2016, NIPS.
[30] David Kappel,et al. Deep Rewiring: Training very sparse deep networks , 2017, ICLR.
[31] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[32] Michael Carbin,et al. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks , 2018, ICLR.
[33] J. Kaas,et al. Connectivity-driven white matter scaling and folding in primate cerebral cortex , 2010, Proceedings of the National Academy of Sciences.
[34] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[35] Babak Hassibi,et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon , 1992, NIPS.
[36] Yves Chauvin,et al. A Back-Propagation Algorithm with Optimal Use of Hidden Units , 1988, NIPS.
[37] Miguel Á. Carreira-Perpiñán,et al. "Learning-Compression" Algorithms for Neural Net Pruning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[38] Max Welling,et al. Learning Sparse Neural Networks through L0 Regularization , 2017, ICLR.
[39] Thomas Brox,et al. Striving for Simplicity: The All Convolutional Net , 2014, ICLR.
[40] Gintare Karolina Dziugaite,et al. The Lottery Ticket Hypothesis at Scale , 2019, ArXiv.
[41] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.