Ridge Regression with Over-parametrized Two-Layer Networks Converge to Ridgelet Spectrum
[1] Gordon Wetzstein, et al. Implicit Neural Representations with Periodic Activation Functions, 2020, NeurIPS.
[2] Behnam Neyshabur, et al. Implicit Regularization in Deep Learning, 2017, arXiv.
[3] Kenji Kawaguchi, et al. Deep Learning without Poor Local Minima, 2016, NIPS.
[4] Grant M. Rotskoff, et al. Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks, 2018, NeurIPS.
[5] Andrea Montanari, et al. A mean field view of the landscape of two-layer neural networks, 2018, Proceedings of the National Academy of Sciences.
[6] Maxim Raginsky, et al. A mean-field theory of lazy training in two-layer neural nets: entropic regularization and controlled McKean-Vlasov dynamics, 2020, arXiv.
[7] Matthias Hein, et al. The Loss Surface of Deep and Wide Neural Networks, 2017, ICML.
[8] Matus Telgarsky, et al. Spectrally-normalized margin bounds for neural networks, 2017, NIPS.
[9] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[10] Stevan Pilipovic, et al. The Ridgelet transform of distributions, 2013, arXiv:1306.2024.
[11] Andrea Montanari, et al. Limitations of Lazy Training of Two-layers Neural Networks, 2019, NeurIPS.
[12] Taiji Suzuki, et al. Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics, 2020, NeurIPS.
[13] Nathan Srebro, et al. Characterizing Implicit Bias in Terms of Optimization Geometry, 2018, ICML.
[14] Surya Ganguli, et al. Exponential expressivity in deep neural networks through transient chaos, 2016, NIPS.
[15] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[16] Yann LeCun, et al. Open Problem: The landscape of the loss surfaces of multilayer networks, 2015, COLT.
[17] Arthur Gretton, et al. Maximum Mean Discrepancy Gradient Flow, 2019, NeurIPS.
[18] Andrew R. Barron, et al. Universal approximation bounds for superpositions of a sigmoidal function, 1993, IEEE Trans. Inf. Theory.
[19] Konstantinos Spiliopoulos, et al. Mean Field Analysis of Neural Networks: A Law of Large Numbers, 2018, SIAM J. Appl. Math.
[20] Amir Globerson, et al. Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs, 2017, ICML.
[21] Benjamin Recht, et al. Random Features for Large-Scale Kernel Machines, 2007, NIPS.
[22] Tengyu Ma, et al. Identity Matters in Deep Learning, 2016, ICLR.
[23] Noboru Murata, et al. Sampling Hidden Parameters from Oracle Distribution, 2014, ICANN.
[24] S. Helgason. Integral Geometry and Radon Transforms, 2010.
[25] Justin A. Sirignano, et al. Mean field analysis of neural networks: A central limit theorem, 2018, Stochastic Processes and their Applications.
[26] Francis Bach, et al. On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport, 2018, NeurIPS.
[27] Daniel Soudry, et al. No bad local minima: Data independent training error guarantees for multilayer neural networks, 2016, arXiv.
[28] Michael Carbin, et al. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, 2018, ICLR.
[29] Noboru Murata, et al. The global optimum of shallow neural network is attained by ridgelet transform, 2018.
[30] Ryota Tomioka, et al. Norm-Based Capacity Control in Neural Networks, 2015, COLT.
[31] J. Kuelbs. Probability on Banach spaces, 1978.
[32] Suvrit Sra, et al. Global optimality conditions for deep neural networks, 2017, ICLR.
[33] Gilad Yehudai, et al. On the Power and Limitations of Random Features for Understanding Neural Networks, 2019, NeurIPS.
[34] Shun-ichi Amari. Understand in 5 Minutes!? Skimming Famous Papers: Jacot, Arthur; Gabriel, Franck; and Hongler, Clément: Neural Tangent Kernel: Convergence and Generalization in Neural Networks, 2020.
[35] Jaehoon Lee, et al. Wide neural networks of any depth evolve as linear models under gradient descent, 2019, NeurIPS.
[36] Boris Rubin, et al. The Calderón reproducing formula, windowed X-ray transforms, and Radon transforms in L^p-spaces, 1998.
[37] Minh N. Do, et al. The finite ridgelet transform for image representation, 2003, IEEE Trans. Image Process.
[38] Yi Zhang, et al. Stronger generalization bounds for deep nets via a compression approach, 2018, ICML.
[39] Nathan Srebro, et al. Implicit Bias of Gradient Descent on Linear Convolutional Networks, 2018, NeurIPS.
[40] Surya Ganguli, et al. The Emergence of Spectral Universality in Deep Networks, 2018, AISTATS.
[41] Taiji Suzuki, et al. Stochastic Particle Gradient Descent for Infinite Ensembles, 2017, arXiv.
[42] Haihao Lu, et al. Depth Creates No Bad Local Minima, 2017, arXiv.
[43] Noboru Murata, et al. An Integral Representation of Functions Using Three-layered Networks and Their Approximation Bounds, 1996, Neural Networks.