Replica mean field theory for the generalisation gap of deep neural networks
[1] M. Mézard, et al. Spin Glass Theory and Beyond: An Introduction to the Replica Method and Its Applications, 1986.
[2] E. Gardner. The space of interactions in neural network models, 1988.
[3] Vladimir Vapnik, et al. An overview of statistical learning theory, 1999, IEEE Trans. Neural Networks.
[4] M. Opper, et al. Statistical mechanics of Support Vector networks, 1998, cond-mat/9811421.
[5] Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.
[6] Gábor Lugosi, et al. Introduction to Statistical Learning Theory, 2004, Advanced Lectures on Machine Learning.
[7] Yann LeCun, et al. The MNIST database of handwritten digits, 2005.
[8] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[9] Léon Bottou, et al. Making Vapnik–Chervonenkis Bounds Accurate, 2015.
[10] Dumitru Erhan, et al. Going deeper with convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[12] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[14] F. Santambrogio. {Euclidean, metric, and Wasserstein} gradient flows: an overview, 2016, 1609.03890.
[15] Andrea Montanari, et al. A mean field view of the landscape of two-layer neural networks, 2018, Proceedings of the National Academy of Sciences.
[16] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[17] Peter L. Bartlett, et al. Nearly-tight VC-dimension and Pseudodimension Bounds for Piecewise Linear Neural Networks, 2017, J. Mach. Learn. Res.
[18] Jaehoon Lee, et al. Wide neural networks of any depth evolve as linear models under gradient descent, 2019, NeurIPS.
[19] Andrea Montanari, et al. Linearized two-layers neural networks in high dimension, 2019, The Annals of Statistics.
[20] G. Biroli, et al. Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime, 2020, ICML.
[21] Marco Gherardi, et al. Beyond the storage capacity: data-driven satisfiability transition, 2020, Physical Review Letters.
[22] Vittorio Erba, et al. Statistical learning theory of structured data, 2020, Physical Review E.
[23] Blake Bordelon, et al. Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks, 2020, ICML.
[24] M. Lagomarsino, et al. Counting the learnable functions of geometrically structured data, 2020.
[25] Arthur Jacot, et al. Neural Tangent Kernel: Convergence and Generalization in Neural Networks, 2018, NeurIPS.
[26] Lenka Zdeborová, et al. Understanding deep learning is also a job for physicists, 2020, Nature Physics.
[27] C. Pehlevan, et al. Spectral bias and task-model alignment explain generalization in kernel regression and infinitely wide neural networks, 2020, Nature Communications.
[28] Samy Bengio, et al. Understanding deep learning (still) requires rethinking generalization, 2021, Commun. ACM.
[29] Mikhail Belkin, et al. Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation, 2021, Acta Numerica.
[30] Andrea Montanari, et al. The Generalization Error of Random Features Regression: Precise Asymptotics and the Double Descent Curve, 2019, Communications on Pure and Applied Mathematics.
[31] Carlo Baldassi, et al. Learning through atypical "phase transitions" in overparameterized neural networks, 2021, arXiv.
[32] M. Pastore. Critical properties of the SAT/UNSAT transitions in the classification problem of structured data, 2021, Journal of Statistical Mechanics: Theory and Experiment.
[33] Marco Gherardi, et al. Solvable Model for the Linear Separability of Structured Data, 2021, Entropy.
[34] R. Zecchina, et al. Unveiling the structure of wide flat minima in neural networks, 2021, Physical Review Letters.
[35] Surya Ganguli, et al. A theory of high dimensional regression with arbitrary correlations between input features and target functions: sample complexity, multiple descent curves and a hierarchy of phase transitions, 2021, ICML.