Extended Derivation Cube Based View Materialization Selection in Distributed Data Warehouse

View materialization is considered one of the most efficient ways to speed up decision support and OLAP queries in a data warehouse architecture. Research on view materialization covers many topics, such as rewriting user queries so that they are transparently redirected from base tables to materialized views, and updating materialized views as soon as the base tables change. Underlying most of these topics is a more fundamental problem: selecting the right views to materialize. While much work has been done on view selection in the centralized case, there is still no appropriate solution to view selection in a distributed data warehouse architecture, which is the focus of this paper. We model the views on distributed warehouse nodes with the derivation cube, a concept widely used in centralized data warehouses, and extend it to the distributed case. We then propose a greedy selection algorithm that operates under a storage cost constraint. Finally, a detailed experimental comparison demonstrates the advantage of our solution over simply applying the centralized methods repeatedly on each warehouse node.
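To make the idea of greedy selection under a storage constraint concrete, the following is a minimal sketch of the classic single-node heuristic: repeatedly materialize the view with the highest benefit per unit of storage. It is not the extended algorithm proposed in this paper, and it does not model warehouse nodes or communication costs. The function name `greedy_select`, the representation of views as sets of group-by attributes, and the example sizes are all assumptions made for illustration.

```python
def greedy_select(sizes, top, budget):
    """Hypothetical sketch of greedy view selection under a storage budget.

    sizes  : dict mapping each view (a frozenset of group-by attributes)
             to its estimated row count
    top    : the base view, assumed to be always available
    budget : maximum total rows allowed for additional materialized views
    """
    selected = {top}
    # Cost of answering each view = size of its cheapest materialized ancestor.
    cost = {w: sizes[top] for w in sizes}
    used = 0
    while True:
        best, best_ratio = None, 0.0
        for v in sizes:
            if v in selected or used + sizes[v] > budget:
                continue
            # A view w is derivable from v when w's attributes are a subset of v's.
            benefit = sum(max(0, cost[w] - sizes[v]) for w in sizes if w <= v)
            ratio = benefit / sizes[v]
            if ratio > best_ratio:
                best, best_ratio = v, ratio
        if best is None:
            return selected, cost
        selected.add(best)
        used += sizes[best]
        for w in sizes:
            if w <= best:
                cost[w] = min(cost[w], sizes[best])


# Illustrative two-dimensional cube with made-up sizes.
example_sizes = {
    frozenset("ab"): 100,
    frozenset("a"): 20,
    frozenset("b"): 50,
    frozenset(): 1,
}
chosen, query_cost = greedy_select(example_sizes, frozenset("ab"), budget=25)
```

With these assumed sizes, view `b` never fits within the budget, so queries on `b` are still answered from the base view. A distributed variant would presumably also have to account for where each view is stored and the cost of moving data between nodes, which this sketch leaves out.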
