Improving Text-to-SQL with Schema Dependency Learning

Text-to-SQL aims to map natural language questions to SQL queries. The sketch-based method combined with execution-guided (EG) decoding strategy has shown a strong performance on the WikiSQL benchmark. However, execution-guided decoding relies on database execution, which significantly slows down the inference process and is hence unsatisfactory for many real-world applications. In this paper, we present the Schema Dependency guided multi-task Text-to-SQL model (SDSQL) to guide the network to effectively capture the interactions between questions and schemas. The proposed model outperforms all existing methods in both the settings with or without EG. We show the schema dependency learning partially cover the benefit from EG and alleviates the need for it. SDSQL without EG significantly reduces time consumption during inference, sacrificing only a small amount of performance and provides more flexibility for downstream applications.

[1]  Jayant Krishnamurthy,et al.  Neural Semantic Parsing with Type Constraints for Semi-Structured Tables , 2017, EMNLP.

[2]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[3]  Yang Zhang,et al.  Mention Extraction and Linking for SQL Query Generation , 2020, EMNLP.

[4]  Timothy Dozat,et al.  Simpler but More Accurate Semantic Dependency Parsing , 2018, ACL.

[5]  Raymond J. Mooney,et al.  Learning to Parse Database Queries Using Inductive Logic Programming , 1996, AAAI/IAAI, Vol. 2.

[6]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[7]  Mirella Lapata,et al.  Language to Logical Form with Neural Attention , 2016, ACL.

[8]  Tao Yu,et al.  SParC: Cross-Domain Semantic Parsing in Context , 2019, ACL.

[9]  Raymond J. Mooney,et al.  Learning Synchronous Grammars for Semantic Parsing with Lambda Calculus , 2007, ACL.

[10]  Roberto Cipolla,et al.  Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[11]  Tao Yu,et al.  TypeSQL: Knowledge-Based Type-Aware Neural Text-to-SQL Generation , 2018, NAACL.

[12]  Ming Zhou,et al.  Semantic Parsing with Syntax- and Table-Aware SQL Generation , 2018, ACL.

[13]  Mihai Surdeanu,et al.  The Stanford CoreNLP Natural Language Processing Toolkit , 2014, ACL.

[14]  Yan Gao,et al.  Towards Complex Text-to-SQL in Cross-Domain Database with Intermediate Representation , 2019, ACL.

[15]  Xiaodong Liu,et al.  RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers , 2020, ACL.

[16]  Tao Yu,et al.  Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task , 2018, EMNLP.

[17]  Quan Liu,et al.  Program Enhanced Fact Verification with Verbalization and Graph Attention Network , 2020, EMNLP.

[18]  Alvin Cheung,et al.  Learning a Neural Semantic Parser from User Feedback , 2017, ACL.

[19]  Wenhu Chen,et al.  TabFact: A Large-scale Dataset for Table-based Fact Verification , 2019, ICLR.

[20]  NAVID YAGHMAZADEH,et al.  SQLizer: query synthesis from natural language , 2017, Proc. ACM Program. Lang..

[21]  Mirella Lapata,et al.  Coarse-to-Fine Decoding for Neural Semantic Parsing , 2018, ACL.

[22]  Rishabh Singh,et al.  Robust Text-to-SQL Generation with Execution-Guided Decoding , 2018, 1807.03100.

[23]  Luke S. Zettlemoyer,et al.  Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars , 2005, UAI.

[24]  Souvik Kundu,et al.  Hybrid Ranking Network for Text-to-SQL , 2020, ArXiv.

[25]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[26]  Timothy Dozat,et al.  Deep Biaffine Attention for Neural Dependency Parsing , 2016, ICLR.

[27]  Luyao Chen,et al.  CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases , 2019, EMNLP.

[28]  Seunghyun Park,et al.  A Comprehensive Exploration on WikiSQL with Table-Aware Word Contextualization , 2019, ArXiv.

[29]  Fei Li,et al.  Constructing an Interactive Natural Language Interface for Relational Databases , 2014, Proc. VLDB Endow..

[30]  Thomas Muller,et al.  TaPas: Weakly Supervised Table Parsing via Pre-training , 2020, ACL.

[31]  Luke S. Zettlemoyer,et al.  Online Learning of Relaxed CCG Grammars for Parsing to Logical Form , 2007, EMNLP.

[32]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[33]  Kaushik Chakrabarti,et al.  X-SQL: reinforce schema representation with context , 2019, ArXiv.

[34]  Bei Chen,et al.  RECPARSER: A Recursive Semantic Parsing Framework for Text-to-SQL Task , 2020, IJCAI.

[35]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[36]  Richard Socher,et al.  Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning , 2018, ArXiv.

[37]  Dong Ryeol Shin,et al.  RYANSQL: Recursively Applying Sketch-based Slot Fillings for Complex Text-to-SQL in Cross-Domain Databases , 2020, CL.

[38]  Dawn Xiaodong Song,et al.  SQLNet: Generating Structured Queries From Natural Language Without Reinforcement Learning , 2017, ArXiv.