暂无分享,去创建一个
[1] John Schulman,et al. Concrete Problems in AI Safety , 2016, ArXiv.
[2] Benjamin Recht,et al. Do ImageNet Classifiers Generalize to ImageNet? , 2019, ICML.
[3] Christian Van den Broeck,et al. Statistical Mechanics of Learning , 2001 .
[4] Percy Liang,et al. Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization , 2019, ArXiv.
[5] M. Opper,et al. Statistical mechanics of Support Vector networks. , 1998, cond-mat/9811421.
[6] Jaehoon Lee,et al. Deep Neural Networks as Gaussian Processes , 2017, ICLR.
[7] Julien Mairal,et al. On the Inductive Bias of Neural Tangent Kernels , 2019, NeurIPS.
[8] Boaz Barak,et al. Deep double descent: where bigger models and more data hurt , 2019, ICLR.
[9] Jaehoon Lee,et al. Neural Tangents: Fast and Easy Infinite Neural Networks in Python , 2019, ICLR.
[10] Sompolinsky,et al. Statistical mechanics of learning from examples. , 1992, Physical review. A, Atomic, molecular, and optical physics.
[11] Christopher K. I. Williams,et al. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning) , 2005 .
[12] David Lopez-Paz,et al. Invariant Risk Minimization , 2019, ArXiv.
[13] Kouichi Sakurai,et al. One Pixel Attack for Fooling Deep Neural Networks , 2017, IEEE Transactions on Evolutionary Computation.
[14] Tengyu Ma,et al. Optimal Regularization Can Mitigate Double Descent , 2021, ICLR.
[15] Aleksander Madry,et al. Robustness May Be at Odds with Accuracy , 2018, ICLR.
[16] Amin Karbasi,et al. Multiple Descent: Design Your Own Generalization Curve , 2020, NeurIPS.
[17] Anthony Widjaja,et al. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2003, IEEE Transactions on Neural Networks.
[18] Mikhail Belkin,et al. To understand deep learning we need to understand kernel learning , 2018, ICML.
[19] S. Ganguli,et al. Statistical mechanics of complex neural systems and high dimensional data , 2013, 1301.7115.
[20] Blake Bordelon,et al. Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks , 2020, ICML.
[21] Wouter M. Kouw,et al. A Review of Domain Adaptation without Target Labels , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[22] Tengyuan Liang,et al. On the Multiple Descent of Minimum-Norm Interpolants and Restricted Lower Isometry of Kernels , 2019, COLT.
[23] Tomaso A. Poggio,et al. Regularization Networks and Support Vector Machines , 2000, Adv. Comput. Math..
[24] Nello Cristianini,et al. Kernel Methods for Pattern Analysis , 2003, ICTAI.
[25] Le Song,et al. Scalable Kernel Methods via Doubly Stochastic Gradients , 2014, NIPS.
[26] Aleksander Madry,et al. Exploring the Landscape of Spatial Robustness , 2017, ICML.
[27] Mikhail Belkin,et al. Reconciling modern machine-learning practice and the classical bias–variance trade-off , 2018, Proceedings of the National Academy of Sciences.
[28] 俊一 甘利. 5分で分かる!? 有名論文ナナメ読み:Jacot, Arthor, Gabriel, Franck and Hongler, Clement : Neural Tangent Kernel : Convergence and Generalization in Neural Networks , 2020 .
[29] Yaser S. Abu-Mostafa,et al. Mismatched Training and Test Distributions Can Outperform Matched Ones , 2015, Neural Computation.
[30] Jaehoon Lee,et al. Wide neural networks of any depth evolve as linear models under gradient descent , 2019, NeurIPS.
[31] Ameet Talwalkar,et al. Foundations of Machine Learning , 2012, Adaptive computation and machine learning.
[32] Levent Sagun,et al. Triple descent and the two kinds of overfitting: where and why do they appear? , 2020, NeurIPS.
[33] A. Wald. Statistical Decision Functions Which Minimize the Maximum Risk , 1945 .
[34] Preetum Nakkiran,et al. More Data Can Hurt for Linear Regression: Sample-wise Double Descent , 2019, ArXiv.
[35] Thomas G. Dietterich,et al. Benchmarking Neural Network Robustness to Common Corruptions and Perturbations , 2018, ICLR.
[36] M. Opper,et al. On the ability of the optimal perceptron to generalise , 1990 .
[37] Ievgen Redko,et al. Advances in Domain Adaptation Theory , 2019 .
[38] Aleksander Madry,et al. Adversarial Examples Are Not Bugs, They Are Features , 2019, NeurIPS.
[39] Martin Arjovsky. Out of Distribution Generalization in Machine Learning , 2021, ArXiv.
[40] Ken-ichi Kawarabayashi,et al. How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks , 2020, ICLR.
[41] Percy Liang,et al. An Investigation of Why Overparameterization Exacerbates Spurious Correlations , 2020, ICML.
[42] Yuan Xu,et al. Approximation Theory and Harmonic Analysis on Spheres and Balls , 2013 .
[43] Aaron C. Courville,et al. Out-of-Distribution Generalization via Risk Extrapolation (REx) , 2020, ICML.
[44] Andrea Montanari,et al. Surprises in High-Dimensional Ridgeless Least Squares Interpolation , 2019, Annals of statistics.
[45] J. Hertz,et al. Generalization in a linear perceptron in the presence of noise , 1992 .
[46] Blake Bordelon,et al. Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks , 2020 .
[47] Florent Krzakala,et al. Capturing the learning curves of generic features maps for realistic data sets with a teacher-student model , 2021, ArXiv.
[48] A. Cavagna,et al. Spin-glass theory for pedestrians , 2005, cond-mat/0505032.
[49] M. Mézard,et al. Spin Glass Theory And Beyond: An Introduction To The Replica Method And Its Applications , 1986 .
[50] Florent Krzakala,et al. Double Trouble in Double Descent : Bias and Variance(s) in the Lazy Regime , 2020, ICML.
[51] Neil D. Lawrence,et al. Dataset Shift in Machine Learning , 2009 .
[52] G. Wahba. Spline models for observational data , 1990 .
[53] Koby Crammer,et al. A theory of learning from different domains , 2010, Machine Learning.
[54] J. Hertz,et al. Phase transitions in simple learning , 1989 .