[1] Jaehoon Lee, et al. Wide neural networks of any depth evolve as linear models under gradient descent, 2019, NeurIPS.
[2] A solution of an integral equation, 1969.
[3] Nathan Srebro, et al. Implicit Bias of Gradient Descent on Linear Convolutional Networks, 2018, NeurIPS.
[4] Arthur Jacot, et al. Neural tangent kernel: convergence and generalization in neural networks, 2018, NeurIPS.
[5] Roman Vershynin, et al. Four lectures on probabilistic methods for data science, 2016, IAS/Park City Mathematics Series.
[6] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[7] Matus Telgarsky, et al. The implicit bias of gradient descent on nonseparable data, 2019, COLT.
[8] Yoshua Bengio, et al. On the Spectral Bias of Neural Networks, 2018, ICML.
[9] Samet Oymak, et al. Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?, 2018, ICML.
[10] Nathan Srebro, et al. The Implicit Bias of Gradient Descent on Separable Data, 2017, J. Mach. Learn. Res.
[11] Christopher M. Bishop, et al. Regularization and complexity control in feed-forward networks, 1995.
[12] Felix Abramovich, et al. Improved inference in nonparametric regression using Lk-smoothing splines, 1996.
[13] David Rolnick, et al. Complexity of Linear Regions in Deep Networks, 2019, ICML.
[14] Ruslan Salakhutdinov, et al. Geometry of Optimization and Implicit Regularization in Deep Learning, 2017, arXiv.
[15] Zhanxing Zhu, et al. Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes, 2017, arXiv.
[16] Surya Ganguli, et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, 2013, ICLR.
[17] Nathan Srebro, et al. Characterizing Implicit Bias in Terms of Optimization Geometry, 2018, ICML.
[18] Ankit B. Patel, et al. A Functional Characterization of Randomly Initialized Gradient Descent in Deep ReLU Networks, 2019.
[19] Radford M. Neal. Priors for Infinite Networks, 1996.
[20] G. Lo, et al. Weak Convergence (IA). Sequences of Random Vectors, 2016, arXiv:1610.05415.
[21] Josef Teichmann, et al. How implicit regularization of Neural Networks affects the learned function - Part I, 2019, arXiv.
[22] Zheng Ma, et al. A type of generalization error induced by initialization in deep neural networks, 2019, MSML.
[23] Jaehoon Lee, et al. Deep Neural Networks as Gaussian Processes, 2017, ICLR.
[24] Yuanzhi Li, et al. A Convergence Theory for Deep Learning via Over-Parameterization, 2018, ICML.
[25] Christopher Holmes, et al. Spatially adaptive smoothing splines, 2006.
[26] Sylvain Gelly, et al. Gradient Descent Quantizes ReLU Network Features, 2018, arXiv.
[27] Nathan Srebro, et al. How do infinite width bounded norm networks look in function space?, 2019, COLT.
[28] Nathan Srebro, et al. A Function Space View of Bounded Norm Infinite Width ReLU Nets: The Multivariate Case, 2019, ICLR.
[29] Robert D. Nowak, et al. Minimum "Norm" Neural Networks are Splines, 2019, arXiv.
[30] Greg Yang, et al. Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation, 2019, arXiv.
[31] J. L. Walsh, et al. The theory of splines and their applications, 1969.
[32] Francis Bach, et al. On Lazy Training in Differentiable Programming, 2018, NeurIPS.
[33] Razvan Pascanu, et al. Sharp Minima Can Generalize For Deep Nets, 2017, ICML.
[34] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[35] Barnabás Póczos, et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks, 2018, ICLR.
[36] Francis Bach, et al. Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss, 2020, COLT.
[37] Ryota Tomioka, et al. In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning, 2014, ICLR.