PSWEEP: Parallel View Maintenance under Concurrent Data Updates of Distributed Sources

Data warehouses (DWs) are built by gathering information from several information sources (ISs) and integrating it into one repository customized to users' needs. Recent work has begun to address the problem of view maintenance of DWs under concurrent data updates of different ISs. SWEEP, proposed by Agrawal et al. [AAS97], is one of the more popular solutions, but its performance is limited because the view maintenance module enforces a sequential ordering on the handling of data updates from the ISs. We have overcome this limitation by developing a parallel algorithm for view maintenance, called PSWEEP, that retains all benefits of SWEEP while offering substantially improved performance. To perform view maintenance in parallel, we solve two issues: detecting maintenance-concurrent data updates in a parallel mode, and correcting the problem that the DW commit order may not correspond to the DW update processing order when updates are handled in parallel. By decomposing SWEEP into an architecture of modular components, we can insert a local timestamp assignment module that detects maintenance-concurrent data updates without requiring any global clock synchronization. We introduce the negative counter concept as a simple yet sufficient solution to the Variant-DW-Commit problem, in which the effects of data updates are committed to the DW in varying orders. We have proven the correctness of PSWEEP, guaranteeing that our strategy generates the correct final DW state. An evaluation of both SWEEP and PSWEEP shows that PSWEEP has the potential for a multi-fold performance improvement over SWEEP, depending on the number of threads that the given DW system implementation can support.
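The idea of detecting maintenance-concurrent updates with local timestamps only can be illustrated with a minimal sketch. This is not the PSWEEP implementation: the names Message, LocalClock, and concurrent_updates are hypothetical, and the sketch assumes that each IS delivers its messages to the DW in FIFO order, so that the arrival order at the DW alone decides which updates may have affected a query answer.

```python
import threading
from dataclasses import dataclass


@dataclass
class Message:
    source: int       # index of the information source that sent the message
    kind: str         # "update" (data update) or "answer" (reply to a maintenance query)
    payload: object
    ts: int = -1      # local arrival timestamp, assigned by the warehouse


class LocalClock:
    """Hypothetical timestamp module: stamps messages in arrival order
    using only a counter local to the warehouse (no global clock)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next = 1
        self.log = []  # all stamped messages, in arrival order

    def stamp(self, msg):
        # The lock keeps stamping atomic when several threads receive messages.
        with self._lock:
            msg.ts = self._next
            self._next += 1
            self.log.append(msg)
        return msg


def concurrent_updates(log, handled, answer):
    """Return updates from the answering source that arrived after the
    update being handled and before the answer to its maintenance query.
    Under the FIFO assumption, these may have affected that answer."""
    return [
        m for m in log
        if m.kind == "update"
        and m.source == answer.source
        and handled.ts < m.ts < answer.ts
    ]


clock = LocalClock()
u1 = clock.stamp(Message(1, "update", "u1"))
u2 = clock.stamp(Message(2, "update", "u2"))
u3 = clock.stamp(Message(2, "update", "u3"))
ans = clock.stamp(Message(2, "answer", "answer to query for u1"))
u4 = clock.stamp(Message(2, "update", "u4"))

interfering = [m.payload for m in concurrent_updates(clock.log, u1, ans)]
```

In this example, u2 and u3 from source 2 arrive between u1 and the answer from source 2, so they are flagged; u4 arrives after the answer and is not. Because each thread handling a different update can run the same check against the shared log, the detection does not depend on updates being processed one at a time.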

[1] Jennifer Widom, et al. View maintenance in a warehousing environment, 1995, SIGMOD '95.

[2] Alejandro P. Buchmann, et al. Research Issues in Data Warehousing, 1997, BTW.

[3] D. Agrawal, et al. Efficient View Maintenance at Data Warehouses, 1997.

[4] H. V. Jagadish, et al. Maintenance and Self Maintenance of Outer-Join Views, 1997, NGITS.

[5] Sunil Samtani, et al. Maintaining consistency in partially self-maintainable views at the data warehouse, 1998, Proceedings Ninth International Workshop on Database and Expert Systems Applications (Cat. No.98EX130).

[6] Mukesh K. Mohania, et al. Incremental Maintenance of Materialized Views, 1997, DEXA.

[7] Jian Yang, et al. A framework for designing materialized views in data warehousing environment, 1997, Proceedings of 17th International Conference on Distributed Computing Systems.

[8] Yue Zhuge, et al. The WHIPS prototype for data warehouse creation and maintenance, 1997, Proceedings 13th International Conference on Data Engineering.

[9] Yue Zhuge, et al. Distributed and parallel computing issues in data warehousing (abstract), 1998, SPAA '98.

[10] Yue Zhuge, et al. The Strobe algorithms for multi-source warehouse consistency, 1996, Fourth International Conference on Parallel and Distributed Information Systems.

[11] Kenneth A. Ross, et al. Concurrency Control Theory for Deferred Materialized Views, 1997, ICDT.

[12] Jennifer Widom, et al. Making views self-maintainable for data warehousing, 1996, Fourth International Conference on Parallel and Distributed Information Systems.

[13] Latha S. Colby, et al. Algorithms for deferred view maintenance, 1996, SIGMOD '96.

[14] Kenneth A. Ross, et al. Implementing Incremental View Maintenance in Nested Data Models, 1997, DBPL.

[15] Jennifer Widom, et al. A System Prototype for Warehouse View Maintenance, 1996, VIEWS.