Block Access Estimation for Clustered Data

A method is proposed for handling nonuniform data distributions in database organizations when estimating the expected number of blocks that contain the tuples requested by a query. When tuples with the same attribute value are not uniformly distributed over the blocks of secondary memory that store the relation, a clustering effect is observed. This effect can be characterized by a single parameter, the clustering factor, which can be stored in the system catalog. The method also applies to uniform data distributions, since it is shown that a uniform distribution can be viewed as a particular instance of a class of clustered distributions. In this case the proposed method considerably reduces the number of computational steps needed to compute the estimate.
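For orientation, the uniform case mentioned above can be illustrated with the classical block-access estimate commonly attributed to Yao. The sketch below is not the method proposed here and does not involve the clustering factor. It only shows the kind of quantity being estimated, under the assumption that every block holds the same number of tuples and that the requested tuples are drawn without replacement. The function name and parameters are hypothetical.

```python
from fractions import Fraction
from math import comb


def expected_blocks_uniform(n_tuples: int, n_blocks: int, k_selected: int) -> float:
    """Expected number of blocks touched when k_selected tuples are drawn
    without replacement from n_tuples spread evenly over n_blocks.

    Illustrative baseline only (assumed uniform placement); it is not the
    clustered estimate described in the text.
    """
    if n_blocks <= 0 or n_tuples % n_blocks != 0:
        raise ValueError("n_tuples must be a non-negative multiple of n_blocks")
    if not 0 <= k_selected <= n_tuples:
        raise ValueError("k_selected must lie between 0 and n_tuples")

    per_block = n_tuples // n_blocks
    # Probability that one given block holds none of the selected tuples.
    # math.comb returns 0 when k_selected exceeds n_tuples - per_block,
    # which correctly makes a miss impossible in that case.
    miss = Fraction(comb(n_tuples - per_block, k_selected),
                    comb(n_tuples, k_selected))
    # By linearity of expectation, sum the hit probability over all blocks.
    return float(n_blocks * (1 - miss))


if __name__ == "__main__":
    # Hypothetical relation: 1000 tuples in 100 blocks, 10 tuples requested.
    print(expected_blocks_uniform(1000, 100, 10))
```

Exact rational arithmetic is used so that boundary cases (no tuples requested, all tuples requested) come out exactly. The binomial coefficients grow quickly with the number of requested tuples, which hints at why a cheaper estimate is attractive.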