Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP
[1] Ari S. Morcos,et al. Beyond neural scaling laws: beating power law scaling via data pruning , 2022, NeurIPS.
[2] Jing Yu Koh,et al. Scaling Autoregressive Models for Content-Rich Text-to-Image Generation , 2022, Trans. Mach. Learn. Res.
[3] David J. Fleet,et al. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding , 2022, NeurIPS.
[4] Alexander W. Fang,et al. Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP) , 2022, ICML.
[5] Oriol Vinyals,et al. Flamingo: a Visual Language Model for Few-Shot Learning , 2022, NeurIPS.
[6] Prafulla Dhariwal,et al. Hierarchical Text-Conditional Image Generation with CLIP Latents , 2022, ArXiv.
[7] Andrew M. Dai,et al. PaLM: Scaling Language Modeling with Pathways , 2022, J. Mach. Learn. Res.
[8] Lisa Anne Hendricks,et al. Training Compute-Optimal Large Language Models , 2022, ArXiv.
[9] Ari S. Morcos,et al. Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time , 2022, ICML.
[10] Chen Change Loy,et al. Conditional Prompt Learning for Vision-Language Models , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] David J. Fleet,et al. Kubric: A scalable dataset generator , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Renelito Delos Santos,et al. LaMDA: Language Models for Dialog Applications , 2022, ArXiv.
[13] Saining Xie,et al. SLIP: Self-supervision meets Language-Image Pre-training , 2021, ECCV.
[14] Prafulla Dhariwal,et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models , 2021, ICML.
[15] Po-Sen Huang,et al. Scaling Language Models: Methods, Analysis & Insights from Training Gopher , 2021, ArXiv.
[16] Lu Yuan,et al. Florence: A New Foundation Model for Computer Vision , 2021, ArXiv.
[17] Karan Desai,et al. RedCaps: web-curated image-text data created by the people, for the people , 2021, NeurIPS Datasets and Benchmarks.
[18] Quoc V. Le,et al. Combined Scaling for Zero-shot Transfer Learning , 2021, Neurocomputing.
[19] Ron Mokady,et al. ClipCap: CLIP Prefix for Image Captioning , 2021, ArXiv.
[20] Ross B. Girshick,et al. Masked Autoencoders Are Scalable Vision Learners , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Zhenguo Li,et al. FILIP: Fine-grained Interactive Language-Image Pre-Training , 2021, ICLR.
[22] Peng Gao,et al. Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling , 2021, ArXiv.
[23] Jenia Jitsev,et al. LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs , 2021, ArXiv.
[24] Raphael Gontijo Lopes,et al. No One Representation to Rule Them All: Overlapping Features of Training Methods , 2021, ICLR.
[25] Ethan Caballero,et al. Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers , 2021, ArXiv.
[26] Junjie Yan,et al. Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm , 2021, ICLR.
[27] Peng Gao,et al. CLIP-Adapter: Better Vision-Language Models with Feature Adapters , 2021, Int. J. Comput. Vis.
[28] Vinay Uday Prabhu,et al. Multimodal datasets: misogyny, pornography, and malignant stereotypes , 2021, ArXiv.
[29] Jong Wook Kim,et al. Robust fine-tuning of zero-shot models , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[30] G. Dziugaite,et al. Deep Learning on a Data Diet: Finding Important Examples Early in Training , 2021, NeurIPS.
[31] Yair Carmon,et al. Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization , 2021, ICML.
[32] Behnam Neyshabur,et al. The Evolution of Out-of-Distribution Robustness Throughout Fine-Tuning , 2021, Trans. Mach. Learn. Res.
[33] Jiecao Chen,et al. WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning , 2021, SIGIR.
[34] Rishabh K. Iyer,et al. GRAD-MATCH: Gradient Matching based Data Subset Selection for Efficient Deep Model Training , 2021, ICML.
[35] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[36] Alec Radford,et al. Zero-Shot Text-to-Image Generation , 2021, ICML.
[37] Radu Soricut,et al. Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Quoc V. Le,et al. Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision , 2021, ICML.
[39] Pang Wei Koh,et al. WILDS: A Benchmark of in-the-Wild Distribution Shifts , 2020, ICML.
[40] Alexander D'Amour,et al. Underspecification Presents Challenges for Credibility in Modern Machine Learning , 2020, J. Mach. Learn. Res.
[41] Lin Gao,et al. 3D-FUTURE: 3D Furniture Shape with TextURE , 2020, International Journal of Computer Vision.
[42] Alexander D'Amour,et al. On Robustness and Transferability of Convolutional Neural Networks , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[43] Benjamin Recht,et al. Evaluating Machine Accuracy on ImageNet , 2020, ICML.
[44] David Lopez-Paz,et al. In Search of Lost Domain Generalization , 2020, ICLR.
[45] Benjamin Recht,et al. Measuring Robustness to Natural Distribution Shifts in Image Classification , 2020, NeurIPS.
[46] D. Song,et al. The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization , 2020, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[47] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[48] Benjamin Recht,et al. The Effect of Natural Distribution Shift on Question Answering Models , 2020, ICML.
[49] Aaron C. Courville,et al. Out-of-Distribution Generalization via Risk Extrapolation (REx) , 2020, ICML.
[50] Jacob Devlin,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[51] Geoffrey E. Hinton,et al. A Simple Framework for Contrastive Learning of Visual Representations , 2020, ICML.
[52] Alec Radford,et al. Scaling Laws for Neural Language Models , 2020, ArXiv.
[53] Tatsunori B. Hashimoto,et al. Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization , 2019, ArXiv.
[54] Ross B. Girshick,et al. Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[55] Balaji Lakshminarayanan,et al. Deep Ensembles: A Loss Landscape Perspective , 2019, ArXiv.
[56] Jason J. Corso,et al. Unified Vision-Language Pre-Training for Image Captioning and VQA , 2019, AAAI.
[57] David Lopez-Paz,et al. Invariant Risk Minimization , 2019, ArXiv.
[58] Sanjoy Dasgupta,et al. Teaching a black-box learner , 2019, ICML.
[59] Eric P. Xing,et al. Learning Robust Global Representations by Penalizing Local Predictive Power , 2019, NeurIPS.
[60] Begüm Demir,et al. Bigearthnet: A Large-Scale Benchmark Archive for Remote Sensing Image Understanding , 2019, IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium.
[61] Benjamin Recht,et al. Do ImageNet Classifiers Generalize to ImageNet? , 2019, ICML.
[62] Amos J. Storkey,et al. CINIC-10 is not ImageNet or CIFAR-10 , 2018, ArXiv.
[63] Thomas G. Dietterich,et al. Benchmarking Neural Network Robustness to Common Corruptions and Perturbations , 2018, ICLR.
[64] Max Welling,et al. Rotation Equivariant CNNs for Digital Pathology , 2018, MICCAI.
[65] Fabio Roli,et al. Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning , 2017, Pattern Recognit.
[66] Frank Hutter,et al. Decoupled Weight Decay Regularization , 2017, ICLR.
[67] Yang Song,et al. The iNaturalist Species Classification and Detection Dataset , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[68] Stefan Leutenegger,et al. SceneNet RGB-D: 5M Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth , 2016, ArXiv.
[69] Charles Blundell,et al. Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles , 2016, NIPS.
[70] Frank Hutter,et al. SGDR: Stochastic Gradient Descent with Warm Restarts , 2016, ICLR.
[71] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[72] David A. Shamma,et al. YFCC100M: The New Data in Multimedia Research , 2015, Commun. ACM.
[73] Joan Bruna,et al. Intriguing properties of neural networks , 2013, ICLR.
[74] Fabio Roli,et al. Evasion Attacks against Machine Learning at Test Time , 2013, ECML/PKDD.
[75] Alexei A. Efros,et al. Unbiased look at dataset bias , 2011, CVPR 2011.
[76] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[77] Neil D. Lawrence,et al. Dataset Shift in Machine Learning , 2009.
[78] Thomas G. Dietterich. Ensemble Methods in Machine Learning , 2000, Multiple Classifier Systems (Lecture Notes in Computer Science).
[79] Eric Bauer,et al. An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants , 1999, Machine Learning.
[80] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.
[81] Rita Cucchiara,et al. From Show to Tell: A Survey on Image Captioning , 2021, ArXiv.
[82] Boris Katz,et al. ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models , 2019, NeurIPS.
[83] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009.