Search Like a Human: Neural Machine Translation for Database Search

With an ever-growing amount of data generated every day, simplifying how that data is processed has become a necessity. We need new tools that let non-technical users access and filter data in a form as close to natural human language as possible. In this paper, we implement the state-of-the-art model for generating an SQL query from a natural language question. The model consists of three parts: a contextualized word representation model (e.g. ELMo, BERT, etc.); a sequence-to-SQL generation decoder; and an Execution-Guided (EG) decoder that guarantees the consistency of the generated SQL query. We experiment with a different contextualized word representation model, the recently published GPT-2, to test the robustness of the SOTA core model, and we modify additional features of the model (using Transformers). We observe that this approach, along with the NL2SQL layer suggested by SQLova [11], produces very promising results and places third on the Salesforce WikiSQL leaderboard. Our implementation with OpenAI GPT-2 achieves 77.5% logical form accuracy and 84.4% execution accuracy on the test set (compared to 83.6% and 89.6% for SQLova, respectively). Our findings show GPT-2 (small) to be a solid multitasking pre-trained model, though it does not yet produce results as good as BERT (small).
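To make the encoder swap concrete, below is a minimal sketch of how a natural language question can be encoded with GPT-2 to obtain the contextualized token representations that feed a sequence-to-SQL decoder. It assumes the Hugging Face transformers library and the small GPT-2 checkpoint; the paper's actual pipeline (question and table-header concatenation, the SQLova-style decoder heads, EG decoding) is not reproduced here, and the example question is hypothetical.

```python
# A minimal sketch (not the paper's exact pipeline): encode a question with
# GPT-2 small via the Hugging Face `transformers` library, one way to obtain
# the contextualized word representations a sequence-to-SQL decoder consumes.
import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # "gpt2" = the small model
model = GPT2Model.from_pretrained("gpt2")
model.eval()

# Hypothetical example: a question over a WikiSQL-style table.
question = "How many gold medals did France win in 2016?"
inputs = tokenizer(question, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One 768-dimensional contextualized vector per sub-word token; in a
# SQLova-style architecture these would feed the decoder heads that predict
# the SELECT column, the aggregation operator, and the WHERE clauses.
token_embeddings = outputs.last_hidden_state  # shape: (1, num_tokens, 768)
print(token_embeddings.shape)
```

Note that GPT-2, unlike BERT, attends only to left context, which may account for part of the accuracy gap reported above.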