Beyond neural scaling laws: beating power law scaling via data pruning
[1] Prafulla Dhariwal, et al. Hierarchical Text-Conditional Image Generation with CLIP Latents, 2022, ArXiv.
[2] Andrew M. Dai, et al. PaLM: Scaling Language Modeling with Pathways, 2022, J. Mach. Learn. Res.
[3] Lisa Anne Hendricks, et al. Training Compute-Optimal Large Language Models, 2022, ArXiv.
[4] Luca M. Schulze Buschoff, et al. Trivial or impossible - dichotomous data difficulty masks model differences (on ImageNet and beyond), 2021, ICLR.
[5] Alexander Kolesnikov, et al. Scaling Vision Transformers, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[6] C. Farabet, et al. Training Data Subset Search With Ensemble Active Learning, 2019, IEEE Transactions on Intelligent Transportation Systems.
[7] Utkarsh Sharma. Scaling Laws from the Data Manifold Dimension, 2022, J. Mach. Learn. Res.
[8] Wojciech Czaja, et al. Active Learning at the ImageNet Scale, 2021, ArXiv.
[9] Jonathan S. Rosenfeld. Scaling Laws for Deep Learning, 2021, ArXiv.
[10] Michael S. Bernstein, et al. On the Opportunities and Risks of Foundation Models, 2021, ArXiv.
[11] G. Dziugaite, et al. Deep Learning on a Data Diet: Finding Important Examples Early in Training, 2021, NeurIPS.
[12] Yair Carmon, et al. Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization, 2021, ICML.
[13] Li Fei-Fei, et al. Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering, 2021, ACL.
[14] M. Bethge, et al. Partial success in closing the gap between human and machine vision, 2021, NeurIPS.
[15] Andrea Vedaldi, et al. PASS: An ImageNet replacement for self-supervised pretraining without humans, 2021, NeurIPS Datasets and Benchmarks.
[16] Ilya Sutskever, et al. Learning Transferable Visual Models From Natural Language Supervision, 2021, ICML.
[17] Jaehoon Lee, et al. Explaining neural scaling laws, 2021, Proceedings of the National Academy of Sciences of the United States of America.
[18] Tom Henighan, et al. Scaling Laws for Transfer, 2021, ArXiv.
[19] Stefano Soatto, et al. Estimating informativeness of samples with Smooth Unique Information, 2021, ICLR.
[20] Prafulla Dhariwal, et al. Data and Parameter Scaling Laws for Neural Machine Translation, 2021.
[21] Mark Chen, et al. Scaling Laws for Autoregressive Generative Modeling, 2020, ArXiv.
[22] Vitaly Feldman, et al. What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation, 2020, NeurIPS.
[23] Felix A. Wichmann, et al. Beyond accuracy: quantifying trial-by-trial behaviour of CNNs and humans by measuring error consistency, 2020, NeurIPS.
[24] Julien Mairal, et al. Unsupervised Learning of Visual Features by Contrasting Cluster Assignments, 2020, NeurIPS.
[25] Surya Ganguli, et al. Statistical Mechanics of Deep Learning, 2020, Annual Review of Condensed Matter Physics.
[26] Alec Radford, et al. Scaling Laws for Neural Language Models, 2020, ArXiv.
[27] Fei-Fei Li, et al. Towards fairer datasets: filtering and balancing the distribution of the people subtree in the ImageNet hierarchy, 2019, FAT*.
[28] Luca Saglietti, et al. Large deviations for the perceptron model and consequences for active learning, 2019, MSML.
[29] Jonathan S. Rosenfeld, et al. A Constructive Prediction of the Generalization Error Across Scales, 2019, ICLR.
[30] Baharan Mirzasoleiman, et al. Coresets for Data-efficient Training of Machine Learning Models, 2019, ICML.
[31] Eric P. Xing, et al. Learning Robust Global Representations by Penalizing Local Predictive Power, 2019, NeurIPS.
[32] Taghi M. Khoshgoftaar, et al. Survey on deep learning with class imbalance, 2019, J. Big Data.
[33] Hai-Jun Zhou, et al. Active online learning in the binary perceptron problem, 2019, Communications in Theoretical Physics.
[34] Hossein Mobahi, et al. Semantic Redundancies in Image-Classification Datasets: The 10% You Don't Need, 2019, ArXiv.
[35] Yoshua Bengio, et al. An Empirical Study of Example Forgetting during Deep Neural Network Learning, 2018, ICLR.
[36] Matthias Bethge, et al. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness, 2018, ICLR.
[37] Matthias Bethge, et al. Generalisation in humans and deep neural networks, 2018, NeurIPS.
[38] Kaiming He, et al. Exploring the Limits of Weakly Supervised Pretraining, 2018, ECCV.
[39] Silvio Savarese, et al. Active Learning for Convolutional Neural Networks: A Core-Set Approach, 2017, ICLR.
[40] Yang Yang, et al. Deep Learning Scaling is Predictable, Empirically, 2017, ArXiv.
[41] Matthias Bethge, et al. Methods and measurements to compare men against machines, 2017, HVEI.
[42] Florent Krzakala, et al. Statistical physics of inference: thresholds and algorithms, 2015, ArXiv.
[43] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[44] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.
[45] S. Ganguli, et al. Statistical mechanics of complex neural systems and high dimensional data, 2013, arXiv:1301.7115.
[46] Burr Settles. Active Learning Literature Survey, 2009.
[47] Jason Weston, et al. Fast Kernel Classifiers with Online and Active Learning, 2005, J. Mach. Learn. Res.
[48] Christian Van den Broeck, et al. Statistical Mechanics of Learning, 2001.
[49] H. Sebastian Seung, et al. Information, Prediction, and Query by Committee, 1992, NIPS.
[50] Sompolinsky, et al. Statistical mechanics of learning from examples, 1992, Physical Review A.
[51] E. Gardner. The space of interactions in neural network models, 1988.
[52] M. Mézard, et al. Spin Glass Theory and Beyond, 1987.