Few-shot Text-to-SQL Translation using Structure and Content Prompt Learning

A common problem with adopting Text-to-SQL translation in database systems is poor generalization. Specifically, when there is limited training data on new datasets, existing few-shot Text-to-SQL techniques, even with carefully designed textual prompts on pre-trained language models (PLMs), tend to be ineffective. In this paper, we present a divide-and-conquer framework, called SC-Prompt, to better support few-shot Text-to-SQL translation. It divides Text-to-SQL translation into two stages (or sub-tasks), so that each sub-task is simpler to tackle. The first stage, called the structure stage, steers a PLM to generate an SQL structure (including SQL commands such as SELECT, FROM, WHERE and SQL operators such as "<" and ">") with placeholders for missing identifiers. The second stage, called the content stage, guides a PLM to populate the placeholders in the generated SQL structure with concrete content (including SQL identifiers such as table names and column names, as well as constant values). We propose a hybrid prompt strategy that combines learnable vectors and fixed vectors (i.e., word embeddings of textual prompts), so that the hybrid prompt can capture contextual information to better guide PLMs for prediction in both stages. In addition, we design keyword-constrained decoding to ensure the validity of generated SQL structures, and structure-guided decoding to ensure that the model fills the placeholders with correct content. Extensive experiments comparing against ten state-of-the-art Text-to-SQL solutions (at the time of writing) show that SC-Prompt significantly outperforms them in the few-shot scenario. In particular, on the widely adopted Spider dataset, given fewer than 500 labeled training examples (5% of the official training set), SC-Prompt outperforms the previous SOTA methods by around 5% in accuracy.
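To make the two-stage design concrete, the following is a minimal sketch (not the authors' released code) of how a structure stage and a content stage could be trained on top of a seq2seq PLM with a hybrid prompt, i.e., learnable vectors concatenated with the frozen word embeddings of a textual prompt. The prompt wording, the "[tab]"/"[col]"/"[val]" placeholder tokens, the helper names, and the use of t5-small are illustrative assumptions; the keyword-constrained and structure-guided decoding components are omitted.

```python
# Sketch of SC-Prompt-style two-stage training with a hybrid prompt.
# Assumptions: placeholder tokens, prompt texts, and helper names are hypothetical.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

MODEL_NAME = "t5-small"  # any seq2seq PLM; larger T5 variants would be typical
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)

# Hybrid prompt: a block of learnable vectors plus the PLM's fixed word embeddings.
N_LEARNABLE = 20                       # number of trainable prompt vectors per stage
embed = model.get_input_embeddings()   # frozen word-embedding table of the PLM
learnable_prompt = torch.nn.Parameter(
    torch.randn(N_LEARNABLE, model.config.d_model) * 0.02
)

def hybrid_inputs(textual_prompt: str, question: str, schema: str):
    """Concatenate [learnable vectors ; embeddings of textual prompt + input]."""
    text = f"{textual_prompt} question: {question} schema: {schema}"
    ids = tokenizer(text, return_tensors="pt").input_ids
    fixed = embed(ids)                                   # (1, L, d_model)
    soft = learnable_prompt.unsqueeze(0)                 # (1, N, d_model)
    inputs_embeds = torch.cat([soft, fixed], dim=1)
    attn = torch.ones(inputs_embeds.shape[:2], dtype=torch.long)
    return inputs_embeds, attn

def training_step(textual_prompt, question, schema, target_text):
    """One supervised step for either stage; the soft prompt receives gradients."""
    inputs_embeds, attn = hybrid_inputs(textual_prompt, question, schema)
    labels = tokenizer(target_text, return_tensors="pt").input_ids
    out = model(inputs_embeds=inputs_embeds, attention_mask=attn, labels=labels)
    return out.loss

question = "How many singers are older than 30?"
schema = "singer(singer_id, name, age)"

# Stage 1 (structure): predict an SQL skeleton with placeholders.
loss_structure = training_step(
    "translate the question into an SQL structure:",
    question, schema,
    "select count ( [col] ) from [tab] where [col] > [val]",
)

# Stage 2 (content): fill the placeholders of the predicted structure.
loss_content = training_step(
    "fill the SQL structure with table names, column names and values:"
    " structure: select count ( [col] ) from [tab] where [col] > [val]",
    question, schema,
    "select count ( singer_id ) from singer where age > 30",
)
print(float(loss_structure), float(loss_content))
```

In this sketch each stage keeps its own soft prompt and textual prompt, and at inference time the structure predicted in stage 1 would be inserted into the stage-2 prompt before the placeholders are filled.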
