暂无分享,去创建一个
[1] Yann LeCun,et al. Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks , 2018, ArXiv.
[2] Erich Elsen,et al. The State of Sparsity in Deep Neural Networks , 2019, ArXiv.
[3] Jaehoon Lee,et al. Wide neural networks of any depth evolve as linear models under gradient descent , 2019, NeurIPS.
[4] Zhuowen Tu,et al. Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Erich Elsen,et al. Exploring Sparsity in Recurrent Neural Networks , 2017, ICLR.
[6] Yang Yang,et al. Deep Learning Scaling is Predictable, Empirically , 2017, ArXiv.
[7] Alec Radford,et al. Scaling Laws for Neural Language Models , 2020, ArXiv.
[8] Roger B. Grosse,et al. Picking Winning Tickets Before Training by Preserving Gradient Flow , 2020, ICLR.
[9] Nathan Srebro,et al. Kernel and Rich Regimes in Overparametrized Models , 2019, COLT.
[10] Jaehoon Lee,et al. Deep Neural Networks as Gaussian Processes , 2017, ICLR.
[11] Mikhail Belkin,et al. Reconciling modern machine-learning practice and the classical bias–variance trade-off , 2018, Proceedings of the National Academy of Sciences.
[12] Ruosong Wang,et al. On Exact Computation with an Infinitely Wide Neural Net , 2019, NeurIPS.
[13] Adel Javanmard,et al. Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks , 2017, IEEE Transactions on Information Theory.
[14] Arthur Jacot,et al. Neural tangent kernel: convergence and generalization in neural networks (invited paper) , 2018, NeurIPS.
[15] Yuanzhi Li,et al. A Convergence Theory for Deep Learning via Over-Parameterization , 2018, ICML.
[16] Yue Wang,et al. Drawing early-bird tickets: Towards more efficient training of deep networks , 2019, ICLR.
[17] Andrew Gordon Wilson,et al. Deep Kernel Learning , 2015, AISTATS.
[18] Matthew Richardson,et al. Do Deep Convolutional Nets Really Need to be Deep and Convolutional? , 2016, ICLR.
[19] Barnabás Póczos,et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks , 2018, ICLR.
[20] David Kappel,et al. Deep Rewiring: Training very sparse deep networks , 2017, ICLR.
[21] Daniel L. K. Yamins,et al. Pruning neural networks without any data by iteratively conserving synaptic flow , 2020, NeurIPS.
[22] Yiran Chen,et al. Holistic SparseCNN: Forging the Trident of Accuracy, Speed, and Size , 2016, ArXiv.
[23] Gintare Karolina Dziugaite,et al. Pruning Neural Networks at Initialization: Why are We Missing the Mark? , 2020, ArXiv.
[24] Erich Elsen,et al. Rigging the Lottery: Making All Tickets Winners , 2020, ICML.
[25] Erich Elsen,et al. Fast Sparse ConvNets , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Ryota Tomioka,et al. In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning , 2014, ICLR.
[27] Lawrence K. Saul,et al. Kernel Methods for Deep Learning , 2009, NIPS.
[28] Michael Carbin,et al. The Lottery Ticket Hypothesis: Training Pruned Neural Networks , 2018, ArXiv.
[29] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Jaehoon Lee,et al. Finite Versus Infinite Neural Networks: an Empirical Study , 2020, NeurIPS.
[31] Philip H. S. Torr,et al. SNIP: Single-shot Network Pruning based on Connection Sensitivity , 2018, ICLR.
[32] Rich Caruana,et al. Do Deep Nets Really Need to be Deep? , 2013, NIPS.
[33] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).