On Implicit Filter Level Sparsity in Convolutional Neural Networks
[1] Song Han, et al. AMC: AutoML for Model Compression and Acceleration on Mobile Devices, 2018, ECCV.
[2] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[3] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[4] Matthew Botvinick, et al. On the importance of single directions for generalization, 2018, ICLR.
[5] Timo Aila, et al. Pruning Convolutional Neural Networks for Resource Efficient Inference, 2016, ICLR.
[6] Hanan Samet, et al. Pruning Filters for Efficient ConvNets, 2016, ICLR.
[7] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[8] Gregory J. Wolff, et al. Optimal Brain Surgeon and general network pruning, 1993, IEEE International Conference on Neural Networks.
[9] Richard Socher, et al. Improving Generalization Performance by Switching from Adam to SGD, 2017, ArXiv.
[10] Lucas Theis, et al. Faster gaze prediction with dense networks and Fisher pruning, 2018, ArXiv.
[11] Taiji Suzuki, et al. Adam Induces Implicit Weight Sparsity in Rectifier Neural Networks, 2018, ICMLA.
[12] Leonidas J. Guibas, et al. ObjectNet3D: A Large Scale Database for 3D Object Recognition, 2016, ECCV.
[13] Bolei Zhou, et al. Revisiting the Importance of Individual Units in CNNs via Ablation, 2018, ArXiv.
[14] Razvan Pascanu, et al. Sharp Minima Can Generalize For Deep Nets, 2017, ICML.
[15] Song Han, et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding, 2015, ICLR.
[16] Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.
[17] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[18] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[19] James Zijun Wang, et al. Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers, 2018, ICLR.
[20] Mingjie Sun, et al. Rethinking the Value of Network Pruning, 2018, ICLR.
[21] R. Venkatesh Babu, et al. Data-free Parameter Pruning for Deep Neural Networks, 2015, BMVC.
[22] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[23] Zhiqiang Shen, et al. Learning Efficient Convolutional Networks through Network Slimming, 2017, ICCV.
[24] Frank Hutter, et al. Fixing Weight Decay Regularization in Adam, 2017, ArXiv.
[25] Michael C. Mozer, et al. Skeletonization: A Technique for Trimming the Fat from a Network via Relevance Assessment, 1988, NIPS.
[26] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models, 2013.
[27] Claus Nebauer, et al. Evaluation of convolutional neural networks for visual recognition, 1998, IEEE Trans. Neural Networks.
[28] Yiran Chen, et al. Learning Structured Sparsity in Deep Neural Networks, 2016, NIPS.
[29] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[30] Joseph Paul Cohen, et al. RandomOut: Using a convolutional gradient norm to rescue convolutional filters, 2016, ArXiv:1602.05931.
[31] Sanjiv Kumar, et al. On the Convergence of Adam and Beyond, 2018, ICLR.
[32] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method, 2012, ArXiv.
[33] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[34] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[35] Rui Peng, et al. Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures, 2016, ArXiv.
[36] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.