Paper information - A physical database design methodology using the property of separability

A physical database design methodology using the property of separability

A new approach to multifile physical database design is presented. Most previous approaches to multifile physical database design concentrated on developing cost evaluators. To arrive at the optimal physical design, however, these approaches had to rely either on the designer's intuition or on exhaustive search, which is practically infeasible even for moderate-sized databases. In our approach we develop a theory called separability to partition the entire database design problem into collective subproblems. Straightforward heuristics are subsequently employed to incorporate the features that cannot be included in the theory. This approach is somewhat formal, deliberately avoiding excessive reliance on heuristics. We develop a design methodology for relational database systems based on the theory. First, we set up a basic design phase in accordance with a formal method that includes a large subset of practically important join methods and then, using heuristics, extend the design procedure to include other join methods as well. We show that the theory of separability can be applied to network model databases as well. In particular, we show that a large subset of practically important access structures available in network model database systems satisfies the conditions for separability. We propose three physical database design algorithms for relational database systems. These algorithms have been fully implemented in the Physical Database Design Optimizer (PhyDDO) in about 6000 lines of Pascal code and validated through testing. The results show that the solutions generated by the design algorithms do not deviate significantly from the optimal solutions. For the implementation of these design algorithms, an extensive set of cost formulas for query, update, deletion, and insertion transactions has been developed. Index selection is an important subproblem of physical database design. Index selection algorithms for relational databases are introduced and validated through testing. The results show that these algorithms do not deviate significantly from the optimal solutions. Finally, we introduce a closed noniterative formula for estimating the number of block accesses. This formula, an approximation of Yao's exact formula, provides significant improvements in both speed of evaluation and accuracy compared with earlier formulas developed by Yao and Cardenas.
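For context on the block-access estimate mentioned at the end of the abstract, the following is a minimal sketch of the two earlier formulas it is compared against, not the closed formula introduced in this work (which the abstract does not state). It assumes `n` records spread evenly over `m` blocks and `k` records selected at random; the function names `yao_blocks` and `cardenas_blocks` are hypothetical and chosen only for illustration.

```python
def yao_blocks(n: int, m: int, k: int) -> float:
    """Yao's exact expected number of blocks touched when k of n records
    are selected without replacement (n records evenly spread over m blocks)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    p = n / m  # records per block (assumed uniform)
    ratio = 1.0
    # Product form of C(n - p, k) / C(n, k): one factor per selected record,
    # which is why evaluation cost grows with k.
    for i in range(1, k + 1):
        factor = (n - p - i + 1) / (n - i + 1)
        if factor <= 0:
            ratio = 0.0  # more records selected than fit outside one block
            break
        ratio *= factor
    return m * (1.0 - ratio)


def cardenas_blocks(m: int, k: int) -> float:
    """Cardenas's closed-form approximation, which treats the k selections
    as independent draws with replacement."""
    return m * (1.0 - (1.0 - 1.0 / m) ** k)


if __name__ == "__main__":
    n, m = 100, 10
    for k in (1, 2, 20, 100):
        print(k, round(yao_blocks(n, m, k), 4), round(cardenas_blocks(m, k), 4))
```

The loop in `yao_blocks` takes one step per selected record, while `cardenas_blocks` is a single expression but tends to underestimate when blocks hold few records. That trade-off between evaluation speed and accuracy is what a closed noniterative approximation of Yao's formula is meant to address.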

Kyu-Young Whang