暂无分享,去创建一个
[1] Stefan Wager,et al. High-Dimensional Asymptotics of Prediction: Ridge Regression and Classification , 2015, 1507.03003.
[2] A Tikhonov,et al. Solution of Incorrectly Formulated Problems and the Regularization Method , 1963 .
[3] Prateek Jain,et al. A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares) , 2017, FSTTCS.
[4] Vladimir Braverman,et al. Benign Overfitting of Constant-Stepsize SGD for Linear Regression , 2021, COLT.
[5] Prateek Jain,et al. Parallelizing Stochastic Gradient Descent for Least Squares Regression: Mini-batching, Averaging, and Model Misspecification , 2016, J. Mach. Learn. Res..
[6] A. Tsigler,et al. Benign overfitting in ridge regression , 2020 .
[7] Dmitry Kobak,et al. The Optimal Ridge Penalty for Real-world High-dimensional Data Can Be Zero or Negative due to the Implicit Ridge Regularization , 2020, J. Mach. Learn. Res..
[8] Nathan Srebro,et al. Characterizing Implicit Bias in Terms of Optimization Geometry , 2018, ICML.
[9] Sham M. Kakade,et al. Random Design Analysis of Ridge Regression , 2012, COLT.
[10] Francis R. Bach,et al. Averaged Least-Mean-Squares: Bias-Variance Trade-offs and Optimal Sampling Distributions , 2015, AISTATS.
[11] R. R. Bahadur. Some Limit Theorems in Statistics , 1987 .
[12] Andrea Montanari,et al. Surprises in High-Dimensional Ridgeless Least Squares Interpolation , 2019, Annals of statistics.
[13] Samy Bengio,et al. Understanding deep learning requires rethinking generalization , 2016, ICLR.
[14] Sham M. Kakade,et al. A risk comparison of ordinary least squares vs ridge regression , 2011, J. Mach. Learn. Res..
[15] Sanjeev Arora,et al. Implicit Regularization in Deep Matrix Factorization , 2019, NeurIPS.
[16] Eric R. Ziegel,et al. The Elements of Statistical Learning , 2003, Technometrics.
[17] Ji Xu,et al. On the number of variables to use in principal component regression , 2019 .
[18] Francis R. Bach,et al. Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression , 2016, J. Mach. Learn. Res..
[19] J. Zico Kolter,et al. A Continuous-Time View of Early Stopping for Least Squares Regression , 2018, AISTATS.
[20] Ryota Tomioka,et al. In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning , 2014, ICLR.
[21] Dimitris Achlioptas,et al. Bad Global Minima Exist and SGD Can Reach Them , 2019, NeurIPS.
[22] Pradeep Ravikumar,et al. Connecting Optimization and Regularization Paths , 2018, NeurIPS.
[23] R. R. Bahadur. Rates of Convergence of Estimates and Test Statistics , 1967 .
[24] Jorge Nocedal,et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima , 2016, ICLR.
[25] Eric Moulines,et al. Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n) , 2013, NIPS.
[26] Philip M. Long,et al. Benign overfitting in linear regression , 2019, Proceedings of the National Academy of Sciences.
[27] Ji Xu,et al. On the Optimal Weighted $\ell_2$ Regularization in Overparameterized Linear Regression , 2020, NeurIPS.
[28] Nadav Cohen,et al. Implicit Regularization in Deep Learning May Not Be Explainable by Norms , 2020, NeurIPS.
[29] Roi Livni,et al. Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study , 2020, NeurIPS.