Role-Based Semantics for Conceptual-Level Queries

We are developing a system known as QUICK (for QUICK is a Universal Interface with Conceptual Knowledge) which provides simplified access to database systems. It allows users to develop applications and specify ad hoc queries without requiring them to understand the underlying schema. Users present high-level queries that specify only attributes to be selected and their constraints. In turn, QUICK infers corresponding SQL queries by using a knowledge construct called a context, which is derived from underlying conceptual schema. For most queries, the context provides enough information to insulate users from the underlying schema. The context does not contain sufficient knowledge to infer the corresponding SQL query from certain classes of high-level queries. Users must specify logical attributes and some of the joins already described by the schema. This paper identifies how relationship roles specified in the conceptual schema in conjunction with new knowledge representation constructs called pseudo-schemas and supercontexts can be exploited to generate reasonable queries on complex schemas for these classes of high-level queries. Both new constructs are automatically inferred from the original conceptual database schema. ____________________________ The copyright of this paper belongs to the paper's authors. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage. Proceedings of the 5th KRDB Workshop Seattle, WA, 31-May-1998 (A. Borgida, V. Chaudhri, M. Staudt, eds.) http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-10/ 1. Background Query formulation on relational databases is a difficult task. Database programmers must nderstand the logical and conceptual database schemas to construct queries. The logical schema describes the structure of the database tables including their attributes, data types, keys, and indexes. The conceptual schema describes the semantic structure of a database including its entities, relationships, and generalizations (a.k.a. specializations). Database programmers that formulate queries to a database certainly know which attributes they want to retrieve and how to constrain the search. However, with conventional query languages, such as SQL, programmers that formulate multiple table queries must understand the semantic structure of the schema to properly formulate the joins and specify the tables. Programmers must also be familiar with the tables in which attributes are located, the table keys, and the associations described in the conceptual schema. If the database schema is discarded, which is often the cas after a database is created, the programmer must perform significant research to reconstruct the semantics of the database. Furthermore, if the underlying data model evolves, all queries of all applications must be examined to determine if they are affected by the change. The affected queries must be updated to reflect the modified database model. Ideally, the application programs and the underlying database environments should be decoupled. Programmers should expect that the database interface hides the underlying schema. Views often provide this capability to application programs. Unfortunately, views are static. When the underlying data model and database changes, all affected views must be updated manually. The use of views moves the onus of change from application programmers, who presumably know little about the underlying data environment, to database programmers, who are more familiar with the underlying data environment. Nevertheless, for a database with many