Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning
Colin Raffel | Mohit Bansal | Haokun Liu | Derek Tam | Mohammed Muqeeth | Jay Mohta | Tenghao Huang
[1] Kuntal Kumar Pal,et al. Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks , 2022, ArXiv.
[2] Yang Gao,et al. PSP: Pre-trained Soft Prompts for Few-Shot Abstractive Summarization , 2022, COLING.
[3] James L. McClelland,et al. Can language models learn from explanations in context? , 2022, EMNLP.
[4] Rabeeh Karimi Mahabadi,et al. Prompt-free and Efficient Few-shot Learning with Language Models , 2022, ACL.
[5] Serge J. Belongie,et al. Visual Prompt Tuning , 2022, ECCV.
[6] Juan Cao,et al. A Prompting-based Approach for Adversarial Example Generation and Robustness Enhancement , 2022, ArXiv.
[7] Zonghan Yang,et al. On Robust Prefix-Tuning for Text Classification , 2022, ICLR.
[8] Huan Sun,et al. Shepherd Pre-trained Language Models to Develop a Train of Thought: An Iterative Prompting Approach , 2022, ArXiv.
[9] Weizhu Chen,et al. Input-Tuning: Adapting Unfamiliar Inputs to Frozen Pretrained Models , 2022, ArXiv.
[10] Yue Zhang,et al. Do Prompts Solve NLP Tasks Using Natural Language? , 2022, ArXiv.
[11] Ed H. Chi,et al. HyperPrompt: Prompt-based Task-Conditioning of Transformers , 2022, ICML.
[12] M. Lewis,et al. Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? , 2022, EMNLP.
[13] Orhan Firat,et al. Using natural language prompts for machine translation , 2022, ArXiv.
[14] AdaPrompt: Adaptive Model Training for Prompt-based NLP , 2022, ArXiv (2202.04824).
[15] Alexander M. Rush,et al. PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts , 2022, ACL.
[16] D. Sontag,et al. Co-training Improves Prompt-based Learning for Large Language Models , 2022, ICML.
[17] Shizhe Diao,et al. Black-box Prompt Learning for Pre-trained Language Models , 2022, ArXiv.
[18] Jennifer G. Dy,et al. Learning to Prompt for Continual Learning , 2021, CVPR.
[19] Yejin Choi,et al. Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts , 2021, NAACL.
[20] Timo Schick,et al. True Few-Shot Learning with Prompts—A Real-World Perspective , 2021, Transactions of the Association for Computational Linguistics.
[21] M. Lewis,et al. MetaICL: Learning to Learn In Context , 2021, NAACL.
[22] Yi Tay,et al. The Efficiency Misnomer , 2021, ICLR.
[23] Brian Lester,et al. SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer , 2021, ACL.
[24] Alexander M. Rush,et al. Multitask Prompted Training Enables Zero-Shot Task Generalization , 2021, ICLR.
[25] G. Karypis,et al. Meta-learning via Language Model In-context Tuning , 2021, ACL.
[26] Graham Neubig,et al. Towards a Unified View of Parameter-Efficient Transfer Learning , 2021, ICLR.
[27] Minlie Huang,et al. PPT: Pre-trained Prompt Tuning for Few-shot Learning , 2021, ACL.
[28] Quoc V. Le,et al. Finetuned Language Models Are Zero-Shot Learners , 2021, ICLR.
[29] Ellie Pavlick,et al. Do Prompt-Based Models Really Understand the Meaning of Their Prompts? , 2021, NAACL.
[30] Fei Huang,et al. Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners , 2021, ICLR.
[31] Luke Zettlemoyer,et al. Noisy Channel Language Model Prompting for Few-Shot Text Classification , 2021, ACL.
[32] Yoav Goldberg,et al. BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models , 2021, ACL.
[33] Yelong Shen,et al. LoRA: Low-Rank Adaptation of Large Language Models , 2021, ICLR.
[34] Colin Raffel,et al. Training Neural Networks with Fixed Sparse Masks , 2021, NeurIPS.
[35] Zhilin Yang,et al. P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks , 2021, ArXiv.
[36] Andreas Stuhlmüller,et al. RAFT: A Real-World Few-Shot Text Classification Benchmark , 2021, NeurIPS Datasets and Benchmarks.
[37] Douwe Kiela,et al. True Few-Shot Learning with Language Models , 2021, NeurIPS.
[38] R. Zemel,et al. Learning a Universal Template for Few-shot Dataset Generalization , 2021, ICML.
[39] Brian Lester,et al. The Power of Scale for Parameter-Efficient Prompt Tuning , 2021, EMNLP.
[40] Guanghui Qin,et al. Learning How to Ask: Querying LMs with Mixtures of Soft Prompts , 2021, NAACL.
[41] Hakan Bilen,et al. Universal Representation Learning from Multiple Domains for Few-shot Classification , 2021, ICCV.
[42] Colin Raffel,et al. Improving and Simplifying Pattern Exploiting Training , 2021, EMNLP.
[43] Zhilin Yang,et al. Controllable Generation from Pre-trained Language Models via Inverse Prompting , 2021, KDD.
[44] Alexander M. Rush,et al. How many data points is a prompt worth? , 2021, NAACL.
[45] D. Klein,et al. Calibrate Before Use: Improving Few-Shot Performance of Language Models , 2021, ICML.
[46] Danqi Chen,et al. Making Pre-trained Language Models Better Few-shot Learners , 2021, ACL.
[47] Armen Aghajanyan,et al. Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning , 2020, ACL.
[48] Alexander M. Rush,et al. Parameter-Efficient Transfer Learning with Diff Pruning , 2020, ACL.
[49] Timo Schick,et al. Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference , 2020, EACL.
[50] Joe Davison,et al. Compacter: Efficient Low-Rank Hypercomplex Adapter Layers , 2021, NeurIPS.
[51] Maosong Sun,et al. On Transferability of Prompt Tuning for Natural Language Understanding , 2021, ArXiv.
[52] Percy Liang,et al. Prefix-Tuning: Optimizing Continuous Prompts for Generation , 2021, ACL.
[53] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[54] Alec Radford,et al. Scaling Laws for Neural Language Models , 2020, ArXiv.
[55] J. Weston,et al. Adversarial NLI: A New Benchmark for Natural Language Understanding , 2019, ACL.
[56] Colin Raffel,et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer , 2019, J. Mach. Learn. Res..
[57] Jason Weston,et al. Neural Text Generation with Unlikelihood Training , 2019, ICLR.
[58] Ronan Le Bras,et al. WinoGrande: An Adversarial Winograd Schema Challenge at Scale , 2019, AAAI.
[59] Ankur Bapna,et al. Simple, Scalable Adaptation for Neural Machine Translation , 2019, EMNLP.
[60] Judith Tonhauser,et al. The CommitmentBank: Investigating projection in naturally occurring discourse , 2019 .
[61] Sebastian Nowozin,et al. Fast and Flexible Multi-Task Classification Using Conditional Neural Adaptive Processes , 2019, NeurIPS.
[62] Ali Farhadi,et al. HellaSwag: Can a Machine Really Finish Your Sentence? , 2019, ACL.
[63] Mona Attariyan,et al. Parameter-Efficient Transfer Learning for NLP , 2019, ICML.
[64] José Camacho-Collados,et al. WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations , 2018, NAACL.
[65] Ilya Sutskever,et al. Language Models are Unsupervised Multitask Learners , 2019 .
[66] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[67] James Allen,et al. Tackling the Story Ending Biases in The Story Cloze Test , 2018, ACL.
[68] Noam Shazeer,et al. Adafactor: Adaptive Learning Rates with Sublinear Memory Cost , 2018, ICML.
[69] Colin Raffel,et al. Realistic Evaluation of Deep Semi-Supervised Learning Algorithms , 2018, NeurIPS.
[70] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[71] Andrea Vedaldi,et al. Learning multiple visual domains with residual adapters , 2017, NIPS.
[72] Zornitsa Kozareva,et al. SemEval-2012 Task 7: Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning , 2012, *SEMEVAL.
[73] Hector J. Levesque,et al. The Winograd Schema Challenge , 2011, AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning.
[74] Gaël Varoquaux,et al. The NumPy Array: A Structure for Efficient Numerical Computation , 2011, Computing in Science & Engineering.
[75] Ido Dagan,et al. The Third PASCAL Recognizing Textual Entailment Challenge , 2007, ACL-PASCAL@ACL.