Partitioning Algorithms for the Computation of Average Iceberg Queries

An iceberg query computes an aggregate function over an attribute (or set of attributes) and returns only the aggregate values above a specified threshold. These queries are difficult to execute because the number of distinct values exceeds the number of counter buckets that fit in memory. Previous research, however, left average functions out of consideration among the aggregate functions. To compute average iceberg queries efficiently, we introduce a theorem for selecting candidates by means of partitioning and propose the POP algorithm based on it. The algorithm partitions a relation logically and, to use memory efficiently, postpones partitioning until all buckets are occupied by candidates. Experiments show that the performance of the proposed algorithm is affected by memory size, data order, and the distribution of the data set.
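The idea of selecting candidates by partitioning can be illustrated with a minimal sketch. It is not the POP algorithm itself and may not match the paper's theorem exactly. It relies only on a standard property of averages: a group's overall average is a weighted mean of its per-partition averages, so a group whose overall average reaches the threshold must reach it in at least one partition. The function name `average_iceberg`, the fixed `partition_size`, and the in-memory list of `(key, value)` rows are assumptions made for illustration; the sketch ignores the bucket limit and does not postpone partitioning.

```python
from collections import defaultdict


def average_iceberg(rows, threshold, partition_size):
    """Return {key: average} for keys whose overall average >= threshold.

    rows is a list of (key, value) pairs (a hypothetical in-memory relation).
    """
    # Pass 1: scan the relation one logical partition at a time and keep
    # every key whose partition-local average reaches the threshold.
    candidates = set()
    for start in range(0, len(rows), partition_size):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for key, value in rows[start:start + partition_size]:
            sums[key] += value
            counts[key] += 1
        for key in sums:
            if sums[key] / counts[key] >= threshold:
                candidates.add(key)

    # Pass 2: compute exact averages for the candidates only, which
    # removes false positives picked up in pass 1.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for key, value in rows:
        if key in candidates:
            sums[key] += value
            counts[key] += 1

    return {
        key: sums[key] / counts[key]
        for key in candidates
        if sums[key] / counts[key] >= threshold
    }


if __name__ == "__main__":
    rows = [("a", 10), ("b", 1), ("a", 2), ("b", 1), ("c", 9), ("c", 9)]
    print(average_iceberg(rows, threshold=5, partition_size=2))
```

With a threshold of 7, key `a` is still selected as a candidate in the first partition (local average 10) but is discarded in the second pass because its overall average is 6. This is why the candidate set has to be verified against the whole relation.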
