Ag+-tree: an Index Structure for Range-aggregation Queries in Data Warehouse Environments

Range-aggregate queries are common in data warehouse environments built on large business relational databases. To evaluate them efficiently, several studies on data cubes, such as the aggregate cubetree, have been carried out. In the well-known aggregate cubetree, each entry in every node stores the aggregate values of its corresponding subtree, so a range-aggregate query can be processed without visiting any child subtree whose nodes are all fully included in the query range. However, the aggregate cubetree does not consider range queries that use only some of the dimensions (partial-dimensional range queries) or range queries without aggregation operations. Concretely, 1) for partial-dimensional range queries, a great deal of information irrelevant to the query must still be read from disk, and 2) although the structure improves the performance of range queries with aggregate operations, it degrades the performance of range queries without them. In earlier work on this problem, we proposed an index structure called the Aggregate-tree (denoted Ag-tree), which removes these weaknesses of the aggregate cubetree. In this paper, we further refine the Ag-tree by sorting the entries in each node. The resulting index structure is called the Ag+-tree.
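
The pruning idea behind the aggregate cubetree can be illustrated with a minimal sketch. It is not the Ag-tree or Ag+-tree, whose node layout the abstract does not specify; it only shows how a stored per-subtree aggregate lets a range-SUM query skip subtrees that lie fully inside the query range. All names here (`Node`, `make_leaf`, `make_inner`, `range_sum`) and the choice of SUM as the aggregate are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point = Tuple[float, ...]
Box = Tuple[Point, Point]  # (low corner, high corner)


@dataclass
class Node:
    # mbr and agg model the entry a parent would keep for this subtree
    # (hypothetical layout, assumed for this sketch).
    mbr: Box
    agg: float  # SUM of all measures stored below this node
    children: List["Node"] = field(default_factory=list)
    records: List[Tuple[Point, float]] = field(default_factory=list)  # leaves only


def contains(outer: Box, inner: Box) -> bool:
    """True if box `inner` lies entirely within box `outer`."""
    return all(ol <= il and ih <= oh
               for ol, il, ih, oh in zip(outer[0], inner[0], inner[1], outer[1]))


def intersects(a: Box, b: Box) -> bool:
    """True if boxes `a` and `b` overlap in every dimension."""
    return all(al <= bh and bl <= ah
               for al, ah, bl, bh in zip(a[0], a[1], b[0], b[1]))


def make_leaf(records: List[Tuple[Point, float]]) -> Node:
    dims = range(len(records[0][0]))
    low = tuple(min(p[d] for p, _ in records) for d in dims)
    high = tuple(max(p[d] for p, _ in records) for d in dims)
    return Node((low, high), sum(m for _, m in records), records=list(records))


def make_inner(children: List[Node]) -> Node:
    dims = range(len(children[0].mbr[0]))
    low = tuple(min(c.mbr[0][d] for c in children) for d in dims)
    high = tuple(max(c.mbr[1][d] for c in children) for d in dims)
    return Node((low, high), sum(c.agg for c in children), children=list(children))


def range_sum(node: Node, query: Box, stats: Dict[str, int]) -> float:
    if not intersects(node.mbr, query):
        return 0.0
    if contains(query, node.mbr):
        # Subtree fully inside the query: use the stored aggregate, no descent.
        return node.agg
    stats["visited"] += 1  # this node's contents must actually be opened
    if node.records:
        return sum(m for p, m in node.records if contains(query, (p, p)))
    return sum(range_sum(c, query, stats) for c in node.children)


leaf1 = make_leaf([((1.0, 1.0), 10.0), ((2.0, 3.0), 20.0)])
leaf2 = make_leaf([((5.0, 5.0), 5.0), ((8.0, 6.0), 7.0)])
root = make_inner([leaf1, leaf2])

stats = {"visited": 0}
total = range_sum(root, ((0.0, 0.0), (6.0, 6.0)), stats)
print(total, stats["visited"])  # 35.0 2
```

In this toy run the first leaf lies entirely inside the query box, so its stored sum is used without opening it; only the root and the partially overlapping second leaf are opened.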
