Translation of Natural Language Queries to SQL that Involve Aggregate Functions, Grouping and Subqueries for a Natural Language Interface to Databases

Huge amounts of information are currently stored in databases (DBs). To make this information accessible to all users, natural language interfaces to databases (NLIDBs) have been developed; these interfaces translate natural language queries into a DB query language. For businesses, the main application of NLIDBs is decision making, since they give flexible access to information. For an NLIDB to be considered complete, it must handle queries that involve the aggregate functions COUNT, MIN, MAX, SUM and AVG. The prototype developed at the Instituto Tecnologico de Cd. Madero (ITCM) can translate natural language queries to SQL; however, it previously had no module for dealing with aggregate functions, grouping and subqueries. This paper describes a new module of this NLIDB that handles aggregate functions, grouping and subqueries, and presents experimental results showing that the interface achieves better recall than C-Phrase.
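
To make the target of such a translation concrete, the following minimal sketch pairs three hypothetical natural language queries with SQL of the kind an aggregate-aware module would need to produce: one plain aggregate, one with grouping, and one with a subquery. The table, data, questions and SQL strings are all illustrative assumptions; none of them is taken from the ITCM prototype or its evaluation, and the sketch does not perform any translation itself.

```python
import sqlite3

# Hypothetical toy schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("Ana", "Sales", 3000.0),
        ("Luis", "Sales", 5000.0),
        ("Eva", "IT", 6000.0),
        ("Raul", "IT", 4000.0),
        ("Sara", "IT", 8000.0),
    ],
)

# Hand-written (question, SQL) pairs; an NLIDB would have to derive
# the SQL side automatically from the question.
EXAMPLES = {
    # Aggregate function only.
    "How many employees are there?":
        "SELECT COUNT(*) FROM employee",
    # Aggregate function combined with grouping.
    "What is the average salary in each department?":
        "SELECT dept, AVG(salary) FROM employee GROUP BY dept ORDER BY dept",
    # Aggregate function inside a subquery.
    "Which employees earn more than the average salary?":
        "SELECT name FROM employee "
        "WHERE salary > (SELECT AVG(salary) FROM employee) "
        "ORDER BY name",
}


def run(sql):
    """Execute one SQL statement on the toy database and return all rows."""
    return conn.execute(sql).fetchall()


results = {question: run(sql) for question, sql in EXAMPLES.items()}
for question, rows in results.items():
    print(question, "->", rows)
```

The third pair shows why subqueries matter for aggregates: the comparison value (the overall average) is not stored anywhere and has to be computed by a nested query before the outer filter can be applied.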