Blake Bordelon | Abdulkadir Canatar | Cengiz Pehlevan
[1] Yang Yang, et al. Deep Learning Scaling is Predictable, Empirically, 2017, ArXiv.
[2] Zheng Ma, et al. Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks, 2019, Communications in Computational Physics.
[3] Ruosong Wang, et al. On Exact Computation with an Infinitely Wide Neural Net, 2019, NeurIPS.
[4] Zheng Ma, et al. Explicitizing an Implicit Bias of the Frequency Principle in Two-layer Neural Networks, 2019, ArXiv.
[5] Tomaso A. Poggio, et al. Regularization Networks and Support Vector Machines, 2000, Adv. Comput. Math.
[6] Jaehoon Lee, et al. Wide neural networks of any depth evolve as linear models under gradient descent, 2019, NeurIPS.
[7] Mikhail Belkin, et al. To understand deep learning we need to understand kernel learning, 2018, ICML.
[8] Jaehoon Lee, et al. Neural Tangents: Fast and Easy Infinite Neural Networks in Python, 2019, ICLR.
[9] Zohar Ringel, et al. Learning Curves for Deep Neural Networks: A Gaussian Field Theory Perspective, 2019, ArXiv.
[10] Vladimir Vapnik, et al. An overview of statistical learning theory, 1999, IEEE Trans. Neural Networks.
[11] Mikhail Belkin, et al. Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate, 2018, NeurIPS.
[12] Greg Yang, et al. A Fine-Grained Spectral Perspective on Neural Networks, 2019, ArXiv.
[13] Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 1896.
[14] Matthieu Wyart, et al. Asymptotic learning curves of kernel methods: empirical data v.s. Teacher-Student paradigm, 2019, ArXiv.
[15] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[16] S. Kirkpatrick, et al. Solvable Model of a Spin-Glass, 1975.
[17] M. Opper, et al. Statistical mechanics of Support Vector networks, 1998, cond-mat/9811421.
[18] Felipe Cucker, et al. Best Choices for Regularization Parameters in Learning Theory: On the Bias-Variance Problem, 2002, Found. Comput. Math.
[19] Anthony Widjaja, et al. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, 2003, IEEE Transactions on Neural Networks.
[20] Julien Mairal, et al. On the Inductive Bias of Neural Tangent Kernels, 2019, NeurIPS.
[21] Milton Abramowitz, et al. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 1964.
[22] G. Arfken. Mathematical Methods for Physicists, 1967.
[23] G. Wahba. Spline models for observational data, 1990.
[24] Yoshua Bengio, et al. On the Spectral Bias of Neural Networks, 2018, ICML.
[25] M. Mézard, et al. Spin Glass Theory and Beyond: An Introduction to the Replica Method and Its Applications, 1986.
[26] Alexander J. Smola, et al. Regularization with Dot-Product Kernels, 2000, NIPS.
[27] Farhan Ali, et al. Flexibility in motor timing constrains the topology and dynamics of pattern generator circuits, 2015, bioRxiv.
[28] Zheng Ma, et al. Theory of the Frequency Principle for General Deep Neural Networks, 2019, CSIAM Transactions on Applied Mathematics.
[29] Manfred Opper, et al. General Bounds on Bayes Errors for Regression with Gaussian Processes, 1998, NIPS.
[30] M. Abramowitz, et al. Handbook of Mathematical Functions With Formulas, Graphs and Mathematical Tables (National Bureau of Standards Applied Mathematics Series No. 55), 1965.
[31] C. Frye, et al. Spherical Harmonics in p Dimensions, 2012, arXiv:1205.3548.
[32] Mikhail Belkin, et al. Does data interpolation contradict statistical optimality?, 2018, AISTATS.
[33] Peter Sollich. Gaussian Process Regression with Mismatched Models, 2001, NIPS.
[34] Zhi-Qin John Xu, et al. Training behavior of deep neural network in frequency domain, 2018, ICONIP.
[35] Christian Van den Broeck, et al. Statistical Mechanics of Learning, 2001.
[36] Asymptotic learning curves of kernel methods: empirical data v.s. Teacher-Student paradigm, 2019.
[37] Simon Haykin, et al. Neural Networks and Learning Machines, 2010.
[38] Peter Sollich, et al. Learning Curves for Gaussian Processes, 1998, NIPS.
[39] Yuan Xu, et al. Approximation Theory and Harmonic Analysis on Spheres and Balls, 2013.
[40] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[41] Arthur Jacot, et al. Neural tangent kernel: convergence and generalization in neural networks (invited paper), 2018, NeurIPS.
[42] J. Hubbard. Calculation of Partition Functions, 1959.
[43] Simon Haykin, et al. Gradient-Based Learning Applied to Document Recognition, 2001.
[44] Yuan Cao, et al. Towards Understanding the Spectral Bias of Deep Learning, 2019, IJCAI.
[45] Adam Krzyzak, et al. A Distribution-Free Theory of Nonparametric Regression, 2002, Springer Series in Statistics.
[46] Tengyuan Liang, et al. Just Interpolate: Kernel "Ridgeless" Regression Can Generalize, 2018, The Annals of Statistics.
[47] Peter Sollich, et al. Learning Curves for Gaussian Process Regression: Approximations and Bounds, 2001, Neural Computation.
[48] Mikhail Belkin, et al. Reconciling modern machine-learning practice and the classical bias–variance trade-off, 2018, Proceedings of the National Academy of Sciences.
[49] Christopher K. I. Williams, et al. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning), 2005.
[50] J. Mercer. Functions of Positive and Negative Type, and their Connection with the Theory of Integral Equations, 1909.
[51] Giorgio Parisi, et al. Infinite Number of Order Parameters for Spin-Glasses, 1979.
[52] S. Ganguli, et al. Statistical mechanics of complex neural systems and high dimensional data, 2013, arXiv:1301.7115.
[54] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.