Semantic Parsing with Syntax- and Table-Aware SQL Generation

We present a generative model to map natural language questions into SQL queries. Existing neural network based approaches typically generate a SQL query word-by-word, however, a large portion of the generated results are incorrect or not executable due to the mismatch between question words and table contents. Our approach addresses this problem by considering the structure of table and the syntax of SQL language. The quality of the generated SQL query is significantly improved through (1) learning to replicate content from column names, cells or SQL keywords; and (2) improving the generation of WHERE clause by leveraging the column-cell relation. Experiments are conducted on WikiSQL, a recently released dataset with the largest question-SQL pairs. Our approach significantly improves the state-of-the-art execution accuracy from 69.0% to 74.4%.

[1]  Martín Abadi,et al.  Learning a Natural Language Interface with Neural Programmer , 2016, ICLR.

[2]  Traian Rebedea,et al.  Dataset for a Neural Natural Language Interface for Databases (NNLIDB) , 2017, IJCNLP.

[3]  Yoav Artzi,et al.  A Corpus of Natural Language for Visual Reasoning , 2017, ACL.

[4]  Trevor Darrell,et al.  Learning to Reason: End-to-End Module Networks for Visual Question Answering , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[5]  Chen Liang,et al.  Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision , 2016, ACL.

[6]  Hang Li,et al.  “ Tony ” DNN Embedding for “ Tony ” Selective Read for “ Tony ” ( a ) Attention-based Encoder-Decoder ( RNNSearch ) ( c ) State Update s 4 SourceVocabulary Softmax Prob , 2016 .

[7]  Navdeep Jaitly,et al.  Pointer Networks , 2015, NIPS.

[8]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[9]  Percy Liang,et al.  Learning executable semantic parsers for natural language understanding , 2016, Commun. ACM.

[10]  Xi Victoria Lin Program Synthesis from Natural Language Using Recurrent Neural Networks , 2017 .

[11]  Dan Klein,et al.  Learning Dependency-Based Compositional Semantics , 2011, CL.

[12]  Wojciech Zaremba,et al.  Reinforcement Learning Neural Turing Machines , 2015, ArXiv.

[13]  Dawn Xiaodong Song,et al.  SQLNet: Generating Structured Queries From Natural Language Without Reinforcement Learning , 2017, ArXiv.

[14]  Tong Guo,et al.  Bidirectional Attention for SQL Generation , 2017, 2019 IEEE 4th International Conference on Cloud Computing and Big Data Analysis (ICCCBDA).

[15]  Ming-Wei Chang,et al.  Search-based Neural Structured Learning for Sequential Question Answering , 2017, ACL.

[16]  Alessandro Moschitti,et al.  Translating Questions to SQL Queries with Generative Parsers Discriminatively Reranked , 2012, COLING.

[17]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[18]  Raymond J. Mooney,et al.  Learning to Parse Database Queries Using Inductive Logic Programming , 1996, AAAI/IAAI, Vol. 2.

[19]  Jonathan Berant,et al.  Weakly Supervised Semantic Parsing with Abstract Examples , 2017, ACL.

[20]  Zhengdong Lu,et al.  Neural Enquirer: Learning to Query Tables in Natural Language , 2016, IEEE Data Eng. Bull..

[21]  Percy Liang,et al.  Data Recombination for Neural Semantic Parsing , 2016, ACL.

[22]  Xiaocheng Feng,et al.  Improving Low Resource Named Entity Recognition using Cross-lingual Knowledge Transfer , 2018, IJCAI.

[23]  Luke S. Zettlemoyer,et al.  Online Learning of Relaxed CCG Grammars for Parsing to Logical Form , 2007, EMNLP.

[24]  Percy Liang,et al.  From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood , 2017, ACL.

[25]  Noah A. Smith,et al.  Recurrent Neural Network Grammars , 2016, NAACL.

[26]  Yoshimasa Tsuruoka,et al.  A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks , 2016, EMNLP.

[27]  Fei Li,et al.  Constructing an Interactive Natural Language Interface for Relational Databases , 2014, Proc. VLDB Endow..

[28]  Hao Ma,et al.  Table Cell Search for Question Answering , 2016, WWW.

[29]  Wojciech Zaremba,et al.  Reinforcement Learning Neural Turing Machines - Revised , 2015 .

[30]  Oren Etzioni,et al.  Towards a theory of natural language interfaces to databases , 2003, IUI '03.

[31]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[32]  Luke S. Zettlemoyer,et al.  Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions , 2013, TACL.

[33]  Yuchen Zhang,et al.  Macro Grammars and Holistic Triggering for Efficient Semantic Parsing , 2017, EMNLP.

[34]  Hong Sun,et al.  A Hybrid Neural Model for Type Classification of Entity Mentions , 2015, IJCAI.

[35]  Percy Liang,et al.  Compositional Semantic Parsing on Semi-Structured Tables , 2015, ACL.

[36]  Alvin Cheung,et al.  Learning a Neural Semantic Parser from User Feedback , 2017, ACL.

[37]  Li Fei-Fei,et al.  Inferring and Executing Programs for Visual Reasoning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[38]  Chris Dyer,et al.  Semantic Parsing with Semi-Supervised Sequential Autoencoders , 2016, EMNLP.

[39]  Hoifung Poon,et al.  Grounded Unsupervised Semantic Parsing , 2013, ACL.

[40]  Rudolf Kadlec,et al.  Text Understanding with the Attention Sum Reader Network , 2016, ACL.

[41]  Dan Klein,et al.  Abstract Syntax Networks for Code Generation and Semantic Parsing , 2017, ACL.

[42]  Isil Dillig,et al.  Type- and Content-Driven Synthesis of SQL Queries from Natural Language , 2017, ArXiv.

[43]  Jayant Krishnamurthy,et al.  Jointly Learning to Parse and Perceive: Connecting Natural Language to the Physical World , 2013, TACL.

[44]  Mirella Lapata,et al.  Language to Logical Form with Neural Attention , 2016, ACL.

[45]  Andrew Chou,et al.  Semantic Parsing on Freebase from Question-Answer Pairs , 2013, EMNLP.

[46]  Tao Yu,et al.  TypeSQL: Knowledge-Based Type-Aware Neural Text-to-SQL Generation , 2018, NAACL.

[47]  Claire Gardent,et al.  Sequence-based Structured Prediction for Semantic Parsing , 2016, ACL.

[48]  Raymond J. Mooney,et al.  Learning Synchronous Grammars for Semantic Parsing with Lambda Calculus , 2007, ACL.

[49]  Jingtao Yao,et al.  Chunk-based Decoder for Neural Machine Translation , 2017, ACL.

[50]  Percy Liang,et al.  Inferring Logical Forms From Denotations , 2016, ACL.

[51]  Percy Liang,et al.  Simpler Context-Dependent Logical Forms via Model Projections , 2016, ACL.

[52]  Richard Socher,et al.  Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning , 2018, ArXiv.

[53]  Luke S. Zettlemoyer,et al.  Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars , 2005, UAI.

[54]  Mark Steedman,et al.  Lexical Generalization in CCG Grammar Induction for Semantic Parsing , 2011, EMNLP.

[55]  Ming Zhou,et al.  Sequence-to-Dependency Neural Machine Translation , 2017, ACL.

[56]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[57]  Bowen Zhou,et al.  Pointing the Unknown Words , 2016, ACL.

[58]  Ronald J. Williams,et al.  Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.

[59]  Alexander I. Rudnicky,et al.  Expanding the Scope of the ATIS Task: The ATIS-3 Corpus , 1994, HLT.

[60]  Hang Li,et al.  Coupling Distributed and Symbolic Execution for Natural Language Queries , 2016, ICML.

[61]  Jayant Krishnamurthy,et al.  Neural Semantic Parsing with Type Constraints for Semi-Structured Tables , 2017, EMNLP.