Characterizing Structural Regularities of Labeled Data in Overparameterized Models