SISYPHUS: The implementation of a chunk-based storage manager for OLAP data cubes

In this article, we present the design and implementation of SISYPHUS, a storage manager for data cubes that provides an efficient physical base for on-line analytical processing (OLAP) operations. OLAP places new requirements on the physical storage layer of a database management system. Special characteristics of OLAP cubes, such as multidimensionality, the hierarchical structure of dimensions, and data sparseness, are difficult to handle with ordinary record-oriented storage managers. The SISYPHUS storage manager is based on a chunk-based data model that enables the hierarchical clustering of data at a very low storage cost. We describe the implementation of SISYPHUS' chunk-oriented file system, present the core architecture of the system, and discuss the reasoning behind various design choices and implementation solutions.
