The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift