Search Like a Human: Neural Machine Translation for Database Search

With an ever-growing amount of data generated every day, simplifying how that data is processed has become a necessity. We need new tools that let non-technical users access and filter data in a form as close to natural human language as possible. In this paper, we implement the state-of-the-art model for generating an SQL query from a natural language question. The model consists of three parts: a contextualized word representation model (e.g. ELMo, BERT, etc.); a sequence-to-SQL generation decoder; and an Execution-Guided (EG) decoder that guarantees the consistency of the generated SQL query. We experiment with a different contextualized word representation model, the recently published GPT-2, to test the robustness of the SOTA core model, and we modify additional features of the model (using Transformers). We observe that this approach, along with the NL2SQL layer suggested by SQLova [11], produces very promising results and places third on the Salesforce WikiSQL leaderboard. Our implementation with OpenAI GPT-2 achieves 77.5% logical form accuracy and 84.4% execution accuracy on the test set (compared to 83.6% and 89.6% for SQLova, respectively). Our findings show GPT-2 (small) to be a solid multitasking pre-trained model, though it does not yet produce results as good as BERT (small).
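To make the encoder swap concrete, below is a minimal sketch of how a natural language question can be encoded with GPT-2 to obtain the contextualized token representations that feed a sequence-to-SQL decoder. It assumes the Hugging Face transformers library and the small GPT-2 checkpoint; the paper's actual pipeline (question and table-header concatenation, the SQLova-style decoder heads, EG decoding) is not reproduced here, and the example question is hypothetical.

```python
# A minimal sketch (not the paper's exact pipeline): encode a question with
# GPT-2 small via the Hugging Face `transformers` library, one way to obtain
# the contextualized word representations a sequence-to-SQL decoder consumes.
import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # "gpt2" = the small model
model = GPT2Model.from_pretrained("gpt2")
model.eval()

# Hypothetical example: a question over a WikiSQL-style table.
question = "How many gold medals did France win in 2016?"
inputs = tokenizer(question, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One 768-dimensional contextualized vector per sub-word token; in a
# SQLova-style architecture these would feed the decoder heads that predict
# the SELECT column, the aggregation operator, and the WHERE clauses.
token_embeddings = outputs.last_hidden_state  # shape: (1, num_tokens, 768)
print(token_embeddings.shape)
```

Note that GPT-2, unlike BERT, attends only to left context, which may account for part of the accuracy gap reported above.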