How rotational invariance of common kernels prevents generalization in high dimensions