The DWS (Data Warehouse Striping) technique is a round-robin data partitioning approach designed specifically for distributed data warehousing environments. In DWS, the fact tables are distributed across an arbitrary number of low-cost computers and queries are executed in parallel by all of them, guaranteeing nearly optimal speedup and scale-up. However, using a large number of inexpensive nodes increases the risk of node failures that impair query computation. This paper proposes an approach that gives Data Warehouse Striping the ability to answer queries even in the presence of node failures. The approach is based on selective replication of data over the cluster nodes, which guarantees full availability when one or more nodes fail. The proposal was evaluated using the new TPC-DS benchmark, and the results show that the approach is quite effective.
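To make the idea concrete, the following is a minimal sketch of round-robin striping combined with replication, not the scheme evaluated in the paper. The abstract does not say how replicas are placed, so the chained placement used here (each partition is also copied to the next `copies` nodes) is an assumption, as are all function names (`stripe_round_robin`, `place_replicas`, `parallel_sum`) and the use of plain integers to stand in for fact rows.

```python
from typing import Dict, List, Set


def stripe_round_robin(rows: List[int], num_nodes: int) -> List[List[int]]:
    """Assign fact rows to partitions in round-robin order (row i -> partition i mod n)."""
    partitions: List[List[int]] = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        partitions[i % num_nodes].append(row)
    return partitions


def place_replicas(num_nodes: int, copies: int) -> Dict[int, List[int]]:
    """Assumed chained placement: partition p lives on node p and on the next `copies` nodes."""
    return {
        p: [(p + k) % num_nodes for k in range(copies + 1)]
        for p in range(num_nodes)
    }


def parallel_sum(
    partitions: List[List[int]],
    placement: Dict[int, List[int]],
    failed: Set[int],
) -> int:
    """Compute SUM over the fact rows, reading each partition from any surviving holder."""
    total = 0
    for p, holders in placement.items():
        alive = [node for node in holders if node not in failed]
        if not alive:
            raise RuntimeError(f"partition {p} is unavailable: all holders failed")
        # In a real cluster, node alive[0] would compute this partial aggregate locally.
        total += sum(partitions[p])
    return total


rows = list(range(1, 11))                 # ten fact rows with measure values 1..10
partitions = stripe_round_robin(rows, 4)  # four nodes
placement = place_replicas(4, copies=1)   # one extra copy of each partition
result_with_failure = parallel_sum(partitions, placement, failed={2})
```

With one extra copy per partition, the query still returns the full sum when node 2 is down. Under this particular placement, two adjacent failures (for example nodes 1 and 2) would leave partition 1 without a surviving holder, so tolerating more failures requires a larger `copies` value or a different placement.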