Ranking and Tuning Pre-trained Models: A New Paradigm for Exploiting Model Hubs
[1] Jianmin Wang, et al. From Big to Small: Adaptive Learning to Partial-Set Domains, 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[2] Hiroaki Hayashi, et al. Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing, 2021, ACM Comput. Surv.
[3] Michael S. Bernstein, et al. On the Opportunities and Risks of Foundation Models, 2021, ArXiv.
[4] Zhangjie Cao, et al. Zoo-Tuning: Adaptive Transfer from a Zoo of Models, 2021, ICML.
[5] Zhiyuan Liu, et al. Pre-Trained Models: Past, Present and Future, 2021, AI Open.
[6] A. Dosovitskiy, et al. MLP-Mixer: An all-MLP Architecture for Vision, 2021, NeurIPS.
[7] Gunnar Rätsch, et al. Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning, 2021, ICML.
[8] Mingsheng Long, et al. LogME: Practical Assessment of Pre-trained Models for Transfer Learning, 2021, ICML.
[9] S. Gelly, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, 2020, ICLR.
[10] Yingli Tian, et al. Self-Supervised Visual Feature Learning With Deep Neural Networks: A Survey, 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[11] Stephen Lin, et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[12] Behnam Neyshabur, et al. What is being transferred in transfer learning?, 2020, NeurIPS.
[13] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[14] Chen Sun, et al. What makes for good views for contrastive learning, 2020, NeurIPS.
[15] Quoc V. Le, et al. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators, 2020, ICLR.
[16] Xipeng Qiu, et al. Pre-trained models for natural language processing: A survey, 2020, Science China Technological Sciences.
[17] Kaiming He, et al. Improved Baselines with Momentum Contrastive Learning, 2020, ArXiv.
[18] Tal Hassner, et al. LEEP: A New Measure to Evaluate Transferability of Learned Representations, 2020, ICML.
[19] Stefano Soatto, et al. Rethinking the Hyperparameters for Fine-tuning, 2020, ICLR.
[20] Geoffrey E. Hinton, et al. A Simple Framework for Contrastive Learning of Visual Representations, 2020, ICML.
[21] Ross B. Girshick, et al. Momentum Contrast for Unsupervised Visual Representation Learning, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Sidak Pal Singh, et al. Model Fusion via Optimal Transport, 2019, NeurIPS.
[23] Lysandre Debut, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, ArXiv.
[24] Kevin Gimpel, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, 2019, ICLR.
[25] Joel Nothman, et al. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python, 2019, ArXiv.
[26] J. Leskovec, et al. Strategies for Pre-training Graph Neural Networks, 2019, ICLR.
[27] Ross B. Girshick, et al. Mask R-CNN, 2017, ArXiv.
[28] Mingsheng Long, et al. Stochastic Normalization, 2020, NeurIPS.
[29] Mingsheng Long, et al. Co-Tuning for Transfer Learning, 2020, NeurIPS.
[30] Thomas Wolf, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, 2019, ArXiv.
[31] André Susano Pinto, et al. A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark, 2019, ArXiv.
[32] Tal Hassner, et al. Transferability and Hardness of Supervised Classification Tasks, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[33] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.
[34] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[35] Omer Levy, et al. SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems, 2019, NeurIPS.
[36] Bo Chen, et al. MnasNet: Platform-Aware Neural Architecture Search for Mobile, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Quoc V. Le, et al. Do Better ImageNet Models Transfer Better?, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[39] Xinyang Chen, et al. Catastrophic Forgetting Meets Negative Transfer: Batch Spectral Shrinkage for Safe Transfer Learning, 2019, NeurIPS.
[40] Kaiming He, et al. Exploring the Limits of Weakly Supervised Pretraining, 2018, ECCV.
[41] Leonidas J. Guibas, et al. Taskonomy: Disentangling Task Transfer Learning, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[42] Omer Levy, et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding, 2018, BlackboxNLP@EMNLP.
[43] Xuhong Li, et al. Explicit Inductive Bias for Transfer Learning with Convolutional Networks, 2018, ICML.
[44] Mark Sandler, et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[45] Chen Sun, et al. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[46] George Papandreou, et al. Rethinking Atrous Convolution for Semantic Image Segmentation, 2017, ArXiv.
[47] David A. Patterson, et al. In-datacenter performance analysis of a tensor processing unit, 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[48] Jean Daunizeau, et al. Semi-analytical approximations to statistical moments of sigmoid and softmax mappings of normal variables, 2017, ArXiv.
[49] Richard Socher, et al. Pointer Sentinel Mixture Models, 2016, ICLR.
[50] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[51] Jinpeng Huai, et al. On Hyper-Parameter Estimation in Empirical Bayes: A Revisit of the MacKay Algorithm, 2016, UAI.
[52] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[53] Nikos Komodakis, et al. Wide Residual Networks, 2016, BMVC.
[54] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[55] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[56] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[57] Michael I. Jordan, et al. Learning Transferable Features with Deep Adaptation Networks, 2015, ICML.
[58] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[59] Michael Habeck, et al. Bayesian evidence and model selection, 2014, Digit. Signal Process.
[60] Victor S. Lempitsky, et al. Unsupervised Domain Adaptation by Backpropagation, 2014, ICML.
[61] Dumitru Erhan, et al. Going deeper with convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[62] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.
[63] Sebastiano Vigna, et al. A Weighted Correlation Index for Rankings with Ties, 2014, WWW.
[64] Yoshua Bengio, et al. How transferable are features in deep neural networks?, 2014, NIPS.
[65] Seung Woo Lee, et al. Birdsnap: Large-Scale Fine-Grained Visual Categorization of Birds, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[66] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[67] Iasonas Kokkinos, et al. Describing Textures in the Wild, 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[68] Trevor Darrell, et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[69] Trevor Darrell, et al. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition, 2013, ICML.
[70] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.
[71] Subhransu Maji, et al. Fine-Grained Visual Classification of Aircraft, 2013, ArXiv.
[72] Jonathan Krause, et al. Collecting a Large-scale Dataset of Fine-grained Cars, 2013.
[73] Kevin P. Murphy, et al. Machine Learning: A Probabilistic Perspective, 2012, Adaptive computation and machine learning series.
[74] C. V. Jawahar, et al. Cats and dogs, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[75] Krista A. Ehinger, et al. SUN database: Large-scale scene recognition from abbey to zoo, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[76] Aapo Hyvärinen, et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models, 2010, AISTATS.
[77] Yoshua Bengio, et al. Why Does Unsupervised Pre-training Help Deep Learning?, 2010, AISTATS.
[78] Nir Friedman, et al. Probabilistic Graphical Models: Principles and Techniques, 2009.
[79] Fei-Fei Li, et al. ImageNet: A large-scale hierarchical image database, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[80] Neil D. Lawrence, et al. Dataset Shift in Machine Learning, 2009.
[81] Carl E. Rasmussen, et al. Gaussian processes for machine learning, 2005, Adaptive computation and machine learning.
[82] Alex Krizhevsky. Learning Multiple Layers of Features from Tiny Images, 2009.
[83] Christopher M. Bishop. Pattern Recognition and Machine Learning, 2006, Springer.
[84] Pietro Perona, et al. Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories, 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.
[85] Erik F. Tjong Kim Sang, et al. Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition, 2003, CoNLL.
[86] Ronald Fagin, et al. Comparing top k lists, 2003, SODA '03.
[87] Shai Ben-David, et al. Exploiting Task Relatedness for Multiple Task Learning, 2003, COLT.
[88] Wei Tang, et al. Ensembling neural networks: Many could be better than all, 2002, Artif. Intell.
[89] Sebastian Thrun, et al. Learning to Learn: Introduction and Overview, 1998, Learning to Learn.
[90] Heekuck Oh, et al. Neural Networks for Pattern Recognition, 1993, Adv. Comput.
[91] David J. C. MacKay. Bayesian Interpolation, 1992, Neural Computation.
[92] Thomas M. Cover, et al. Elements of Information Theory, 2005.
[93] Stephen F. Gull. Developments in Maximum Entropy Data Analysis, 1989.
[94] D. Rubin, et al. Maximum Likelihood from Incomplete Data via the EM Algorithm (with discussion), 1977, Journal of the Royal Statistical Society: Series B.
[95] M. Kendall. A New Measure of Rank Correlation, 1938, Biometrika.