[1] Ruosong Wang et al. On Exact Computation with an Infinitely Wide Neural Net. NeurIPS, 2019.
[2] Jaehoon Lee et al. Wide Neural Networks of Any Depth Evolve as Linear Models under Gradient Descent. NeurIPS, 2019.
[3] William Yang Wang et al. Implicit Regularization of Stochastic Gradient Descent in Natural Language Processing: Observations and Implications. arXiv preprint, 2018.
[4] Matthias Hein et al. Optimization Landscape and Expressivity of Deep CNNs. ICML, 2018.
[5] Shun-ichi Amari et al. Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach. AISTATS, 2019.
[6] Barnabás Póczos et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks. ICLR, 2019.
[7] Jaehoon Lee et al. Deep Neural Networks as Gaussian Processes. ICLR, 2018.
[8] Samy Bengio et al. Understanding Deep Learning Requires Rethinking Generalization. ICLR, 2017.
[9] Surya Ganguli et al. Deep Information Propagation. ICLR, 2017.
[10] Yuan Cao et al. Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks. arXiv preprint, 2018.
[11] Greg Yang. Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation. arXiv preprint, 2019.
[12] Samuel S. Schoenholz et al. Mean Field Residual Networks: On the Edge of Chaos. NIPS, 2017.
[13] Masayoshi Kubo et al. Implicit Regularization in Over-parameterized Neural Networks. arXiv preprint, 2019.
[14] Andrea Montanari et al. Linearized Two-Layers Neural Networks in High Dimension. The Annals of Statistics, 2021.
[15] Yuandong Tian et al. Gradient Descent Learns One-hidden-layer CNN: Don't Be Afraid of Spurious Local Minima. ICML, 2018.
[16] Arnaud Doucet et al. On the Impact of the Activation Function on Deep Neural Networks Training. ICML, 2019.
[17] Arthur Jacot et al. Neural Tangent Kernel: Convergence and Generalization in Neural Networks. NeurIPS, 2018.
[18] Francis Bach et al. A Note on Lazy Training in Supervised Differentiable Programming. arXiv preprint, 2018.
[19] Wenqing Hu et al. On the Diffusion Approximation of Nonconvex Stochastic Gradient Descent. Annals of Mathematical Sciences and Applications, 2019.
[20] Jian Sun et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. ICCV, 2015.
[21] Liwei Wang et al. Gradient Descent Finds Global Minima of Deep Neural Networks. ICML, 2019.
[22] Yoshua Bengio et al. Understanding the Difficulty of Training Deep Feedforward Neural Networks. AISTATS, 2010.
[23] Weinan E et al. Stochastic Modified Equations and Adaptive Stochastic Gradient Algorithms. ICML, 2017.