Implications of certain assumptions in database performance evaluation

The assumptions of uniformity and independence of attribute values in a file, uniformity of queries, constant number of records per block, and random placement of qualifying records among the blocks of a file are frequently used in database performance evaluation studies. In this paper we show that these assumptions often result in predicting only an upper bound of the expected system cost. We then discuss the implications of nonrandom placement, nonuniformity, and dependencies of attribute values on database design and database performance evaluation.
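
To make the upper-bound claim concrete, the following is a minimal sketch, not the paper's own analysis. It compares the well-known Cardenas approximation for the expected number of blocks touched under random placement with the block count when qualifying records are stored contiguously. The function names and the example parameters (100 blocks, 10 records per block, 50 qualifying records) are assumptions chosen for illustration.

```python
import math


def expected_blocks_random(num_blocks: int, num_qualifying: int) -> float:
    """Cardenas approximation: expected number of distinct blocks touched
    when each qualifying record falls in a block chosen uniformly at random
    (sampling with replacement)."""
    return num_blocks * (1.0 - (1.0 - 1.0 / num_blocks) ** num_qualifying)


def blocks_clustered(records_per_block: int, num_qualifying: int) -> int:
    """Blocks touched when the qualifying records are stored contiguously
    and start at a block boundary (illustrative best case)."""
    return math.ceil(num_qualifying / records_per_block)


if __name__ == "__main__":
    # Hypothetical file: 100 blocks, 10 records per block, 50 qualifying records.
    random_cost = expected_blocks_random(100, 50)
    clustered_cost = blocks_clustered(10, 50)
    print(f"random placement:    {random_cost:.1f} blocks")
    print(f"clustered placement: {clustered_cost} blocks")
```

In this hypothetical case the random-placement estimate is several times the clustered cost, which is the sense in which a model that assumes random placement can overstate the expected cost when placement is in fact nonrandom.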
