Database design practices for inverted files

Abstract

The database research literature has proposed many procedures, both manual and automated, for database design; the selection of secondary indexes for inverted-file database management systems (DBMS) has been addressed repeatedly. The empirical study reported here indicates that practical inverted file design has been relatively unaffected by this research. This paper characterizes the actual database design process used at inverted-file DBMS installations along such dimensions as: the types of secondary keys constructed, the individuals who make index design decisions, the decisions that are changed (and when) after the initial database implementation, the factors that are considered in indexing decisions, and the literature that is used in the process. The study shows that key selection (as one example of a design decision) is addressed by ad hoc procedures rather than by well-conceived ones. Further, the results indicate that database design is dominated by users and systems analysts, that indexes are frequently changed, and that a wide range of database performance and convenience factors are influential in practice. The paper concludes with some recommendations for database design support tools.
