A Transformational Approach to Database System Implementation

Compiling data-intensive application programs that involve persistent data into efficient implementations requires considering multiple execution schemes. A clean separation between an application's specification and its implementation can increase the number of implementation choices for a single specification. While current database systems offer data independence, their ability to capture complex specifications and computations is very limited. This hurts application performance, because specifications that do not fit must be forced into forms that can be expressed with the few primitives that database languages support. Most programming languages, on the other hand, offer satisfactory computational power and a large variety of data objects to choose from, but support limited persistence. In these languages, concrete implementations of abstract expressions are expanded hierarchically through layers of abstraction rather than generated from alternatives. This dissertation bridges these two approaches by giving the database designer the ability to specify how the abstract objects defined in a program are to be mapped into the storage structures provided by the database, leaving the translation and optimization of the abstract operations to the compiler. We have developed a formal model that describes this process, called the type transformation model, based on Darlington's work on transformational programming. The type transformation model provides a method for translating any abstract operation that manipulates abstract objects into an operation that manipulates concrete objects. The translation is based on the dependencies between the abstract and concrete objects, as stated by an implementation designer. We present a new algebra that facilitates this task, called the uniform traversal combinator algebra.
Database queries on bulk data types, such as lists and sets, are captured as well-formed recursive functions, called traversal combinators, whose definitions are derived from the inductive properties of the structure being traversed. One important contribution of our approach is the treatment of the class of traversal combinators that results from restricting their input functions to be traversal combinators themselves. This yields a disciplined and uniform treatment of programs, in which every program is either a traversal or some other very simple primitive. This uniformity simplifies query translation and optimization and facilitates verification of the resulting translation.
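
As an informal illustration of the idea of a traversal combinator, the following minimal sketch treats a list traversal as a right fold whose shape follows the two inductive cases of a list (empty, or an element followed by a rest). It is written in Python rather than in the notation of the dissertation, and the names `traverse`, `select`, and `join` are hypothetical, chosen only for this example. The `join` function shows the restricted case described above, in which the function passed to a traversal is itself a traversal.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def traverse(step: Callable[[A, B], B], seed: B, xs: Sequence[A]) -> B:
    """Right fold over a list (hypothetical helper, not the thesis notation).

    Empty list        -> seed
    x followed by rest -> step(x, traverse(step, seed, rest))
    """
    acc = seed
    # Walk from the last element to the first so that each element is
    # combined with the result already computed for the rest of the list.
    for x in reversed(xs):
        acc = step(x, acc)
    return acc


def select(pred: Callable[[A], bool], xs: Sequence[A]) -> List[A]:
    """Selection query expressed as a single traversal."""
    return traverse(lambda x, acc: [x] + acc if pred(x) else acc, [], xs)


def join(cond: Callable[[A, C], bool],
         xs: Sequence[A], ys: Sequence[C]) -> List[Tuple[A, C]]:
    """Join query expressed as a traversal whose step is itself a traversal."""
    return traverse(
        # The inner traversal over ys is seeded with the outer accumulator,
        # so matching pairs for x are placed in front of the pairs already
        # produced for the rest of xs.
        lambda x, acc: traverse(
            lambda y, inner: [(x, y)] + inner if cond(x, y) else inner,
            acc,
            ys,
        ),
        [],
        xs,
    )


# Illustrative data, invented for this sketch.
employees = [("ann", 10), ("bob", 20)]
departments = [(10, "db"), (20, "os")]
matched = join(lambda e, d: e[1] == d[0], employees, departments)
```

Because both queries are built from the same single primitive, a translator or optimizer in this style only has to reason about one recursion pattern; this is an assumption-laden simplification of the uniformity argument made above, not a rendering of the algebra itself.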