Skill-Based Few-Shot Selection for In-Context Learning

In-context learning is a paradigm that adapts large language models to downstream tasks by providing a few examples in the prompt. Few-shot selection -- choosing appropriate examples for each test instance separately -- is therefore important for in-context learning. In this paper, we propose Skill-KNN, a skill-based few-shot selection method for in-context learning. The key advantages of Skill-KNN are: (1) it addresses the problem that existing methods based on pre-trained embeddings are easily biased by surface natural language features that do not matter for the target task; (2) it does not require training or fine-tuning of any model, making it suitable for frequently expanding or changing example banks. The key insight is to optimize the inputs fed into the embedding model rather than to tune the model itself. Technically, Skill-KNN generates a skill-based representation for each test case and candidate example via few-shot prompting as a pre-processing step, thereby eliminating unimportant surface features. Experimental results across four cross-domain semantic parsing tasks and four backbone models show that Skill-KNN significantly outperforms existing methods.
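
As a rough sketch of the pipeline the abstract describes, the Python fragment below assumes two black-box callables: describe_skill (an LLM queried with a fixed few-shot pre-processing prompt that rewrites an input into a description of the skills it requires) and embed (any off-the-shelf frozen sentence-embedding model). Both names, and the cosine-similarity kNN with k = 8, are illustrative placeholders rather than the paper's released implementation.

import numpy as np
from typing import Callable, Dict, List

def skill_knn_select(
    test_input: str,
    example_bank: List[Dict[str, str]],     # each item: {"input": ..., "output": ...}
    describe_skill: Callable[[str], str],   # hypothetical: LLM call with a fixed few-shot pre-processing prompt
    embed: Callable[[str], np.ndarray],     # hypothetical: any frozen off-the-shelf sentence-embedding model
    k: int = 8,
) -> List[Dict[str, str]]:
    """Pick k in-context examples whose required skills are closest to the test case."""
    # 1. Rewrite the test case and every candidate into a skill-based description,
    #    discarding surface features (entity names, phrasing) carried by the raw text.
    test_skill = describe_skill(test_input)
    bank_skills = [describe_skill(ex["input"]) for ex in example_bank]

    # 2. Embed the skill descriptions instead of the raw inputs; the embedding
    #    model itself stays frozen, so no training or fine-tuning is needed.
    test_vec = embed(test_skill)
    bank_vecs = np.stack([embed(s) for s in bank_skills])

    # 3. Cosine-similarity kNN over the skill embeddings.
    test_vec = test_vec / np.linalg.norm(test_vec)
    bank_vecs = bank_vecs / np.linalg.norm(bank_vecs, axis=1, keepdims=True)
    sims = bank_vecs @ test_vec
    top = np.argsort(-sims)[:k]

    # Return the selected examples; how they are ordered inside the final prompt
    # is a separate design choice not fixed by this sketch.
    return [example_bank[i] for i in top]

Because each candidate only needs to be described and embedded once when it enters the bank, this kind of pipeline accommodates a frequently expanding or changing example bank without retraining anything.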
