A mathematical modeling approach to the automatic selection of database designs

This paper provides an overview of a methodology developed to support systems analysts in the process of database design. The design approach is built upon an analytic model composed of (1) parametric descriptions for components of a generalized database organization, (2) costing equations that evaluate a proposed modular database design, (3) an analyst interface that accepts an arbitrary database organization for evaluation, and (4) search procedures that automatically generate and compare thousands of alternative designs. Performance is measured as the sum of storage, retrieval, and maintenance costs and is estimated from parameters of the proposed design, the problem description, and the storage environment. A virtual, record-frame view of secondary storage has been developed in which data records are added, deleted, and modified with minimal effect on existing data structures. Application of the modeling approach to a realistic design problem is described, and modeling accuracy to within four percent is claimed.
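
The cost measure and search idea described above can be illustrated with a minimal sketch. It is not the paper's model: the design parameters (`organization`, `block_size`), the three toy cost functions, and all numeric constants are hypothetical placeholders chosen only to show how a total cost, taken as the sum of storage, retrieval, and maintenance costs, might drive an exhaustive comparison of alternative designs.

```python
from itertools import product

# Hypothetical design space: every name and value here is an assumption
# made for illustration, not a parameter taken from the paper.
OPTIONS = {
    "organization": ["sequential", "indexed"],
    "block_size": [1, 2, 4],
}


def storage_cost(design):
    # Toy assumption: larger blocks and an index both consume more space.
    index_overhead = 2.0 if design["organization"] == "indexed" else 0.0
    return design["block_size"] * 1.0 + index_overhead


def retrieval_cost(design):
    # Toy assumption: an index and larger blocks both reduce access cost.
    base = 3.0 if design["organization"] == "indexed" else 10.0
    return base / design["block_size"]


def maintenance_cost(design):
    # Toy assumption: keeping an index current costs more per update.
    return 2.5 if design["organization"] == "indexed" else 1.0


def total_cost(design):
    # Performance is the sum of storage, retrieval, and maintenance costs.
    return storage_cost(design) + retrieval_cost(design) + maintenance_cost(design)


def enumerate_designs(options):
    # Generate every combination of the listed parameter values.
    names = list(options)
    for values in product(*(options[name] for name in names)):
        yield dict(zip(names, values))


def best_design(options):
    # Exhaustive search: evaluate each candidate and keep the cheapest.
    return min(enumerate_designs(options), key=total_cost)


if __name__ == "__main__":
    winner = best_design(OPTIONS)
    print(winner, total_cost(winner))
```

A realistic design space would be far too large for plain enumeration, which is presumably why the paper speaks of search procedures; the exhaustive loop here is only the simplest stand-in for that step.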
