Dynamic maintenance of multidimensional range data partitioning for parallel data processing

The star schema is a common model both for online transaction processing (OLTP) in traditional databases and for online analytical processing (OLAP) in large data warehouses. In a star schema, most of the data is stored in a single table, called the relationship table in database terminology and the fact table in data warehouse terminology; it is also sometimes called a multidimensional table, cube, or data set. In this paper, we present a parallel method that partitions the fact table in multidimensional space for parallel star query processing. We also give a dynamic approach, expressed as a set of heuristics, for maintaining load balance among the processors when the fact table undergoes frequent updates such as insertions and deletions. The multidimensionally partitioned data sets of the fact table are stored as leaf nodes of a multidimensional range tree, and the data set in each leaf node is mapped to a processor for parallel data partitioning and star query processing. To balance the load across processors, the heuristics keep the distribution of data volumes as uniform as possible for star query processing in OLAP.
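As a rough illustration of the idea, the following Python sketch partitions fact-table rows with a k-d-tree-style median split, treats each leaf as the data set of one processor, and rebuilds the tree when one leaf grows too large. The abstract does not spell out the paper's range tree or its heuristics, so everything here is an assumption made for illustration: the names `Node`, `build`, `insert`, `leaves`, and `rebalance`, the median-split rule, the `tolerance` threshold, and the power-of-two processor count.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Row = Tuple[int, ...]  # one fact-table row, reduced to its dimension keys


@dataclass
class Node:
    rows: List[Row] = field(default_factory=list)  # populated only in leaves
    dim: int = 0   # split dimension (internal nodes only)
    cut: int = 0   # rows with row[dim] < cut go to the left child
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None


def build(rows: List[Row], num_leaves: int, depth: int = 0) -> Node:
    """Split at the median, cycling through dimensions, until num_leaves leaves exist.

    Assumes num_leaves is a power of two; may return fewer leaves when
    duplicate keys make a split impossible.
    """
    if num_leaves <= 1 or len(rows) < 2:
        return Node(rows=list(rows))
    dim = depth % len(rows[0])
    ordered = sorted(rows, key=lambda r: r[dim])
    cut = ordered[len(ordered) // 2][dim]
    lo = [r for r in ordered if r[dim] < cut]
    hi = [r for r in ordered if r[dim] >= cut]
    if not lo:  # every key at or below the median is equal: cannot split here
        return Node(rows=list(rows))
    half = num_leaves // 2
    return Node(dim=dim, cut=cut,
                left=build(lo, half, depth + 1),
                right=build(hi, num_leaves - half, depth + 1))


def insert(root: Node, row: Row) -> None:
    """Route a new row down the tree to the leaf (processor) whose range contains it."""
    node = root
    while not node.is_leaf():
        node = node.left if row[node.dim] < node.cut else node.right
    node.rows.append(row)


def leaves(node: Node) -> List[Node]:
    """Return the leaves left to right; leaf i is assumed to live on processor i."""
    if node.is_leaf():
        return [node]
    return leaves(node.left) + leaves(node.right)


def rebalance(root: Node, num_leaves: int, tolerance: float = 1.5) -> Node:
    """Hypothetical heuristic: rebuild when the fullest leaf exceeds tolerance x the mean load."""
    parts = leaves(root)
    loads = [len(p.rows) for p in parts]
    if max(loads) > tolerance * (sum(loads) / len(loads)):
        return build([r for p in parts for r in p.rows], num_leaves)
    return root


if __name__ == "__main__":
    facts = [(x, y) for x in range(8) for y in range(8)]  # toy 2-D fact table
    tree = build(facts, num_leaves=4)
    for i in range(40):                                   # skewed insertions
        insert(tree, (i % 4, i % 4))
    print("before:", [len(p.rows) for p in leaves(tree)])
    tree = rebalance(tree, num_leaves=4)
    print("after: ", [len(p.rows) for p in leaves(tree)])
```

A full rebuild is the bluntest possible repair and is used here only to keep the sketch short; a practical heuristic would more likely move a boundary between neighboring leaves so that only a small share of rows changes processors.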
