Workload-Based Placement and Join Processing in Node-Partitioned Data Warehouses

Data warehouses (DWs) holding very large volumes of data pose major performance and scalability challenges. The Node-Partitioned Data Warehouse (NPDW) partitions the DW across low-cost computer nodes for scalability. Partitioning and data placement strategies influence the performance of complex queries on the NPDW. In this paper, we propose a partitioning, placement, and join processing strategy to improve the performance of costly joins in the NPDW, compare alternative strategies using the TPC-H performance benchmark, and draw conclusions.
