Efficient Transformation of a Natural Language Query to SQL for Urdu
Abstract

Computer users have long wanted to narrow the communication gap between humans and computers. Natural Language Interfaces to Databases (NLIDBs) are one mechanism for achieving this goal. In an NLIDB, the question is asked in everyday human language and the answer is given in the same language. This paper is about NLIDBs for the Urdu language. An algorithm is developed that efficiently maps a natural language query, entered in Urdu, to an SQL (Structured Query Language) statement. The algorithm has been implemented in Visual C#.NET and tested on a database containing a Student Information System and an Employee Information System. The program correctly maps 85% of the natural language queries.

1. Introduction

Natural language interfaces have long been an active area of research. Asking a database questions in natural language is a more user-friendly way of searching than writing and posing a question in the restricted pattern of SQL syntax. Although the kinds of questions and the vocabulary of a particular natural language interface are limited in some way, the user is more comfortable writing questions in a natural fashion than learning the keywords and syntax of SQL. The success of Natural Language Interfaces to Databases (NLIDBs) is partly due to the real-world payoff of the field and partly because Natural Language Processing (NLP) works well in a particular database domain [1].

A number of researchers have developed different NLIDBs. Most of the early systems were based on pattern matching [2]. Lunar was a natural language query system that answered questions about rock samples brought back from the moon [1]. It was able to answer 90% of the questions in its domain when posed by untrained people [2]. LADDER was the first semantic grammar-based system, interfacing a database with information on US Navy ships [2]. Semantic grammars are now widely used in most NLP systems [1].
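As a rough illustration of the semantic-grammar idea, the sketch below encodes a single rule whose categories are domain concepts (attribute, table, value) rather than syntactic ones, and maps an accepted question to SQL. Everything in it is an assumption made for illustration: the lexicon, the toy schema, the rule, and the function name `to_sql` are hypothetical. It uses English words rather than Urdu, and it is not the algorithm proposed in this paper or the grammar of any system cited here.

```python
import re

# Hypothetical lexicon for a toy schema; these names are assumptions,
# not taken from the paper. Each surface word maps to a schema element.
ATTRIBUTES = {"name": "name", "names": "name", "age": "age", "ages": "age", "city": "city"}
TABLES = {"student": "student", "students": "student",
          "employee": "employee", "employees": "employee"}

# One semantic-grammar-style rule, written as a regular expression:
#   QUERY -> "show" ATTRIBUTE "of" TABLE [ "where" ATTRIBUTE "is" VALUE ]
RULE = re.compile(
    r"^show (?P<attr>\w+) of (?P<table>\w+)"
    r"(?: where (?P<cond_attr>\w+) is (?P<value>\w+))?$"
)


def to_sql(question):
    """Return an SQL string if the rule accepts the question, else None."""
    text = question.strip().lower().rstrip("? ")
    match = RULE.match(text)
    if match is None:
        return None  # the question is outside the toy grammar

    attr = ATTRIBUTES.get(match.group("attr"))
    table = TABLES.get(match.group("table"))
    if attr is None or table is None:
        return None  # a word is not in the domain lexicon

    sql = f"SELECT {attr} FROM {table}"
    if match.group("cond_attr") is not None:
        cond_attr = ATTRIBUTES.get(match.group("cond_attr"))
        if cond_attr is None:
            return None
        # VALUE is restricted to word characters by the rule, so it cannot
        # contain a quote; a real system would use a parameterized query.
        sql += f" WHERE {cond_attr} = '{match.group('value')}'"
    return sql


if __name__ == "__main__":
    print(to_sql("Show names of students"))
    print(to_sql("Show age of employees where city is Lahore?"))
    print(to_sql("What is the weather"))
```

Questions that fall outside the rule, or that use words missing from the lexicon, are rejected rather than guessed at, which reflects the restricted vocabulary that such interfaces accept.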
A semantic grammar is a formal definition of a language that uses concepts from a particular domain of discourse to specify acceptable expressions in that language [3]. A large part of the research in the mid-1980s was devoted to portability issues [1]. An example of this kind of system is TEAM [4]. TEAM was the result of a four-year project, and the core aim behind it was to design a portable NLIDB rather than a domain-specific one. The design decisions incorporated in TEAM were generally applicable to a wider range of natural-language processing systems [4]. However, for some of the systems, TEAM was forced to take a more limited approach.

STEP is a natural language interface to relational databases developed by Michael Minock [5]. It is also based on a semantic grammar and uses a paraphrasing mechanism to treat the natural language query. Moreover, it is relatively easy to configure for domain-specific databases. Some work has also been done on a theoretical model for representing English sentences in Prolog [6]. A limitation of that work is the similarity of the sentences taken as examples.

Semantic grammars are widely used today in the design of NLIDBs. An example of such recent work is PRECISE [7]. PRECISE is a system that guarantees the correct mapping of a natural language query to an SQL statement if the query is semantically tractable. Moreover, the system can also resolve ambiguities that arise when a value token may belong to multiple columns. For example, a particular database could contain the value HP under a column company and also under a column platform.

This work is about the transformation of a natural language query in Urdu to SQL. The proposed algorithm efficiently maps a semantically tractable natural language query to an SQL statement. The system is based on formal semantics like PRECISE, but a more efficient approach has been taken to deal with
[1] Oren Etzioni, et al. Towards a theory of natural language interfaces to databases, 2003, IUI '03.
[2] Elaine Rich. Natural-Language Interfaces, 1984, Computer.
[3] Enrico Motta, et al. Question Answering on the Real Semantic Web, 2007.
[4] Douglas E. Appelt, et al. TEAM: An Experimental Transportable Natural-Language Interface, 1986, Fall Joint Computer Conference.