Intersection of Parallels as an Early Stopping Criterion
[1] Percy Liang, et al. Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution, 2022, ICLR.
[2] Behnam Neyshabur, et al. Exploring the Limits of Large Scale Pre-training, 2021, ICLR.
[3] Javier Ruiz-Hidalgo, et al. Channel-Wise Early Stopping without a Validation Set via NNK Polytope Interpolation, 2021, Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[4] Patrick Thiran, et al. Disparity Between Batches as a Signal for Early Stopping, 2021, ECML/PKDD.
[5] Chi Jin, et al. A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Network, 2021, COLT.
[6] M. de Rijke, et al. When Inverse Propensity Scoring does not Work: Affine Corrections for Unbiased Learning to Rank, 2020, CIKM.
[7] Andrea Montanari, et al. The Interpolation Phase Transition in Neural Networks: Memorization and Generalization under Lazy Training, 2020, The Annals of Statistics.
[8] Mikhail Belkin, et al. Classification vs regression in overparameterized regimes: Does the loss function matter?, 2020, J. Mach. Learn. Res.
[9] Christos Thrampoulidis, et al. Analytic Study of Double Descent in Binary Classification: The Impact of Loss, 2020, IEEE International Symposium on Information Theory (ISIT).
[10] Christos Thrampoulidis, et al. A Model of Double Descent for High-dimensional Binary Linear Classification, 2019, Information and Inference: A Journal of the IMA.
[11] M. de Rijke, et al. To Model or to Intervene: A Comparison of Counterfactual and Online Learning to Rank from User Interactions, 2019, SIGIR.
[12] Samet Oymak, et al. Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks, 2019, AISTATS.
[13] Noah A. Smith, et al. To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks, 2019, RepL4NLP@ACL.
[14] J. Zico Kolter, et al. Generalization in Deep Networks: The Role of Distance from Initialization, 2019, ArXiv.
[15] Mikhail Belkin, et al. Reconciling modern machine-learning practice and the classical bias–variance trade-off, 2018, Proceedings of the National Academy of Sciences.
[16] Yuanzhi Li, et al. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers, 2018, NeurIPS.
[17] Wei Hu, et al. A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks, 2018, ICLR.
[18] Yuanzhi Li, et al. Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data, 2018, NeurIPS.
[19] Nathan Srebro, et al. The Implicit Bias of Gradient Descent on Separable Data, 2017, J. Mach. Learn. Res.
[20] Adel Javanmard, et al. Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks, 2017, IEEE Transactions on Information Theory.
[21] Christoph Lassner, et al. Early Stopping without a Validation Set, 2017, ArXiv.
[22] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[23] Thorsten Joachims, et al. Unbiased Learning-to-Rank with Biased Feedback, 2016, WSDM.
[24] Ryan P. Adams, et al. Early Stopping as Nonparametric Variational Inference, 2015, AISTATS.
[25] Tao Qin, et al. Introducing LETOR 4.0 Datasets, 2013, ArXiv.
[26] Grégoire Montavon, et al. Neural Networks: Tricks of the Trade, 2012, Lecture Notes in Computer Science.
[27] Yi Chang, et al. Yahoo! Learning to Rank Challenge Overview, 2010, Yahoo! Learning to Rank Challenge.
[28] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[29] Jong-Dae Kim, et al. Regularization Parameter Determination for Optical Flow Estimation using L-curve, 2007.
[30] Tie-Yan Liu, et al. Learning to rank: from pairwise approach to listwise approach, 2007, ICML '07.
[31] Y. Yao, et al. On Early Stopping in Gradient Descent Learning, 2007.
[32] J. Clarke, et al. Low-field SQUID MRI: To tune or not to tune?, 2006.
[33] S. Gómez, et al. The triangle method for finding the corner of the L-curve, 2002.
[34] R. Bapat, et al. The generalized Moore-Penrose inverse, 1992.
[35] J. van Leeuwen, et al. Neural Networks: Tricks of the Trade, 2002, Lecture Notes in Computer Science.
[36] P. Hansen. Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion, 1998, SIAM.