Paper information - Estimating record selectivities

Estimating record selectivities

Abstract: In this paper we examine the problem of modelling data base contents and user requests. This modelling is necessary in analytic data base performance evaluation studies in order to estimate the number of records of a file that have to be retrieved in response to user requests. The CPU, I/O, and telecommunication costs of the system are directly or indirectly expressed in terms of these quantities. We first show that certain assumptions used for modelling data base contents, data placement on devices, and user requests are often not satisfied in actual data base environments. Thereafter we provide more detailed modelling techniques based on a multivariate statistical model, and we demonstrate their use in improving data base performance.

Stavros Christodoulakis

[1] E. F. Codd, et al. A relational model of data for large shared data banks, 1970, CACM.

[2] Peter M. Neely. Comparison of several algorithms for computation of means, standard deviations and correlation coefficients, 1966, CACM.

[3] James B. Rothnie, et al. Attribute based file organization in a paged memory environment, 1974, CACM.

[4] S. Christodoulakis. A Multivariate Statistical Model for Data Base Performance Evaluation, 1982.

[5] Michael Hammer, et al. A heuristic approach to attribute partitioning, 1979, SIGMOD '79.

[6] Toby J. Teorey, et al. Application of an analytical model to evaluate storage structures, 1976, SIGMOD '76.

[7] Athanasios Papoulis, et al. Probability, Random Variables and Stochastic Processes, 1965.

[8] Stavros Christodoulakis, et al. Estimating selectivities in data bases, 1982.

[9] Richard O. Duda, et al. Pattern classification and scene analysis, 1974, A Wiley-Interscience publication.

[10] P. Bruce Berra, et al. Minimum cost selection of secondary indexes for formatted files, 1977, TODS.

[11] Irving L. Traiger, et al. System R: relational approach to database management, 1976, TODS.