[1] Benjamin Recht,et al. Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning , 2008, NIPS.
[2] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[3] Roland Vollgraf,et al. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms , 2017, ArXiv.
[4] Michael W. Mahoney,et al. Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning , 2018, J. Mach. Learn. Res..
[5] Mikhail Belkin,et al. Reconciling modern machine learning and the bias-variance trade-off , 2018, ArXiv.
[6] R. Couillet,et al. Random Matrix Methods for Wireless Communications , 2011 .
[7] Philip M. Long,et al. Benign overfitting in linear regression , 2019, Proceedings of the National Academy of Sciences.
[8] Vinay Uday Prabhu. Kannada-MNIST: A new handwritten digits dataset for the Kannada language , 2019, ArXiv.
[9] Michael W. Mahoney,et al. Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks , 2019, SDM.
[10] Surya Ganguli,et al. Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice , 2017, NIPS.
[11] Andrew M. Saxe,et al. High-dimensional dynamics of generalization error in neural networks , 2017, Neural Networks.
[12] Michael W. Mahoney,et al. Traditional and Heavy-Tailed Self Regularization in Neural Network Models , 2019, ICML.
[13] Zhenyu Liao,et al. The Dynamics of Learning: A Random Matrix Approach , 2018, ICML.
[14] Michael W. Mahoney,et al. Exact expressions for double descent and implicit regularization via surrogate random design , 2019, NeurIPS.
[15] Romain Couillet,et al. Concentration of Measure and Large Random Matrices with an application to Sample Covariance Matrices , 2018, 1805.08295.
[16] Lucas Benigni,et al. Eigenvalue distribution of nonlinear models of random matrices , 2019, ArXiv.
[17] Vladimir Vapnik,et al. Statistical learning theory , 1998 .
[18] Michael W. Mahoney,et al. Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior , 2017, ArXiv.
[19] Christian Van den Broeck,et al. Statistical Mechanics of Learning , 2001 .
[20] Ameet Talwalkar,et al. On the Impact of Kernel Approximation on Learning Accuracy , 2010, AISTATS.
[21] D. Haussler,et al. Rigorous learning curve bounds from statistical mechanics , 1994, COLT '94.
[22] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[23] Zhenyu Liao,et al. A Random Matrix Approach to Neural Networks , 2017, ArXiv.
[24] S. Kak. Information, physics, and computation , 1996 .
[25] Roy D. Yates,et al. A Framework for Uplink Power Control in Cellular Radio Systems , 1995, IEEE J. Sel. Areas Commun..
[26] Romain Couillet,et al. Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures , 2020, ICML.
[27] Francis R. Bach,et al. On the Equivalence between Kernel Quadrature Rules and Random Feature Expansions , 2015, J. Mach. Learn. Res..
[28] Mikhail Belkin,et al. Reconciling modern machine-learning practice and the classical bias–variance trade-off , 2018, Proceedings of the National Academy of Sciences.
[29] Ameya Velingker,et al. Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees , 2018, ICML.
[30] Christopher K. I. Williams. Computing with Infinite Networks , 1996, NIPS.
[31] Christos Thrampoulidis,et al. A Model of Double Descent for High-dimensional Binary Linear Classification , 2019, ArXiv.
[32] Surya Ganguli,et al. The Emergence of Spectral Universality in Deep Networks , 2018, AISTATS.
[33] W. Hachem,et al. Deterministic equivalents for certain functionals of large random matrices , 2005, math/0507172.
[34] Trevor Hastie,et al. The Elements of Statistical Learning , 2001, Springer Series in Statistics.
[35] Andrea Montanari,et al. Surprises in High-Dimensional Ridgeless Least Squares Interpolation , 2019, Annals of Statistics.
[36] Zhichao Wang,et al. Spectra of the Conjugate Kernel and Neural Tangent Kernel for linear-width neural networks , 2020, NeurIPS.
[37] Surya Ganguli,et al. Statistical Mechanics of Deep Learning , 2020, Annual Review of Condensed Matter Physics.
[38] Romain Couillet,et al. Large Sample Covariance Matrices of Concentrated Vectors , 2018 .
[39] Andrea Montanari,et al. The Generalization Error of Random Features Regression: Precise Asymptotics and the Double Descent Curve , 2019, Communications on Pure and Applied Mathematics.
[40] V. Marčenko,et al. Distribution of eigenvalues for some sets of random matrices , 1967 .
[41] Alexander J. Smola,et al. Learning with Kernels: support vector machines, regularization, optimization, and beyond , 2001, Adaptive computation and machine learning series.
[42] Sompolinsky,et al. Statistical mechanics of learning from examples , 1992, Physical Review A.
[43] Lorenzo Rosasco,et al. Generalization Properties of Learning with Random Features , 2016, NIPS.
[44] Andrew Zisserman,et al. Efficient additive kernels via explicit feature maps , 2010, CVPR.
[45] Stefan Wager,et al. High-Dimensional Asymptotics of Prediction: Ridge Regression and Classification , 2015, 1507.03003.
[46] Francis Bach,et al. On Lazy Training in Differentiable Programming , 2018, NeurIPS.
[47] Liwei Wang,et al. Gradient Descent Finds Global Minima of Deep Neural Networks , 2018, ICML.
[48] Jeffrey Pennington,et al. Nonlinear random matrix theory for deep learning , 2017, NIPS.
[50] T. Watkin,et al. The statistical mechanics of learning a rule , 1993 .
[51] L. Pastur. On Random Matrices Arising in Deep Neural Networks. Gaussian Case , 2020, 2001.06188.
[52] Benjamin Recht,et al. Random Features for Large-Scale Kernel Machines , 2007, NIPS.
[53] Arthur Jacot,et al. Neural Tangent Kernel: Convergence and Generalization in Neural Networks , 2018, NeurIPS.
[54] Zhenyu Liao,et al. On the Spectrum of Random Features Maps of High Dimensional Data , 2018, ICML.
[55] M. Ledoux. The concentration of measure phenomenon , 2001 .
[56] Yuanzhi Li,et al. A Convergence Theory for Deep Learning via Over-Parameterization , 2018, ICML.
[57] Michael W. Mahoney,et al. Statistical Mechanics Methods for Discovering Knowledge from Modern Production Quality Neural Networks , 2019, KDD.
[58] Tengyuan Liang,et al. Just Interpolate: Kernel "Ridgeless" Regression Can Generalize , 2018, The Annals of Statistics.