Exploiting Versions for On-line Data Warehouse Maintenance in MOLAP Servers

A data warehouse is an integrated database whose data is collected from several data sources and which supports on-line analytical processing (OLAP). Queries to a data warehouse are typically complex and involve a large volume of data. To keep the warehouse consistent with the source data, changes to the data sources must be propagated to the warehouse periodically. Because this propagation (maintenance) runs as a batch process, it takes a long time. Since both query transactions and maintenance transactions are long and involve large volumes of data, traditional concurrency control mechanisms such as two-phase locking are not adequate for a data warehouse environment. We propose a multiversion concurrency control mechanism suited to data warehouses that use multi-dimensional OLAP (MOLAP) servers. We call the mechanism multiversion concurrency control for data warehouses (MVCC-DW). To our knowledge, our work is the first attempt to exploit versions for on-line data warehouse maintenance in a MOLAP environment. MVCC-DW guarantees the serializability of concurrent transactions. Transactions running under the mechanism do not block each other and do not need to place locks.
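The general idea of reading from versions instead of taking locks can be illustrated with a minimal sketch. This is not the MVCC-DW algorithm itself, whose details the abstract does not give; it is a generic snapshot-read scheme under stated assumptions: a single maintenance transaction runs at a time, cube cells are keyed by coordinate tuples, and all names (`VersionedCube`, `begin_query`, `commit_maintenance`, and so on) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedCube:
    """Toy multiversion cell store (illustrative only, not MVCC-DW)."""
    # coords -> list of (version, value), appended in increasing version order
    cells: dict = field(default_factory=dict)
    committed: int = 0  # highest version visible to newly started queries

    def begin_query(self) -> int:
        # A query remembers the committed version at its start (its snapshot).
        return self.committed

    def read(self, coords, snapshot):
        # Return the newest value whose version is not later than the snapshot.
        for version, value in reversed(self.cells.get(coords, [])):
            if version <= snapshot:
                return value
        return None

    def begin_maintenance(self) -> int:
        # Assumption: at most one maintenance transaction is active at a time.
        return self.committed + 1

    def write(self, coords, value, version):
        # Writes add a new version; old versions stay readable.
        self.cells.setdefault(coords, []).append((version, value))

    def commit_maintenance(self, version):
        # Publishing the version number makes the whole batch visible at once.
        self.committed = version


cube = VersionedCube()
cell = ("2024", "widgets")

v1 = cube.begin_maintenance()
cube.write(cell, 100, v1)
cube.commit_maintenance(v1)

q = cube.begin_query()            # long-running query starts
v2 = cube.begin_maintenance()     # maintenance starts concurrently
cube.write(cell, 130, v2)
during = cube.read(cell, q)       # query still sees its snapshot
cube.commit_maintenance(v2)
after = cube.read(cell, q)        # unchanged for the old query
fresh = cube.read(cell, cube.begin_query())  # a new query sees the update
```

In this sketch the query never waits for the maintenance batch and the batch never waits for the query, because each reads or writes a different version of the cell. A real system would also need to garbage-collect versions that no active query can still see.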
