A Non-Uniform Data Fragmentation Strategy for Parallel Main-Memory Database Systems

In multi-processor database systems, processor initialization and inter-processor communication overheads cause real systems to deviate from the ideal linear behaviour as the number of processors increases. Main-memory database systems suffer more, because their database processing cost is small compared to that of disk-based database systems and is therefore comparable to the processor initialization cost. The usual uniform data fragmentation strategy divides a relation into equal data partitions, which leaves individual processors idle between the end of their local query execution and global termination. In this paper, we propose a new, non-uniform data fragmentation strategy under which query processing terminates concurrently on all processors. The proposed fragmentation strategy is modeled analytically, simulated, and compared to the uniform strategy. It is proven that the non-uniform fragmentation strategy offers inherently better performance for a parallel database system than the uniform strategy. Furthermore, the non-uniform strategy scales up perfectly up to an upper limit, beyond which a system reconfiguration is needed.
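
The idea of sizing fragments so that all processors finish together can be illustrated with a minimal sketch. The cost model here is an assumption made for illustration and is not taken from the paper's analytical model: processors are started one after another, each start-up costs a fixed `init_cost`, and every tuple costs `tuple_cost` to process. The function names and parameters are hypothetical, and fragment sizes are left fractional rather than rounded to whole tuples.

```python
def non_uniform_fragments(total_tuples, num_procs, init_cost, tuple_cost):
    """Fragment sizes that make every processor finish at the same time.

    Assumed (hypothetical) cost model: processor k (0-based) becomes ready
    at time (k + 1) * init_cost and then needs tuple_cost per tuple.
    Returns (sizes, finish_time); sizes may be fractional.
    """
    # Sum over k of (k + 1) * init_cost, the total start-up delay.
    total_delay = init_cost * num_procs * (num_procs + 1) / 2
    # Common finish time T solves: sum_k (T - (k + 1) * init_cost) / tuple_cost = N.
    finish_time = (tuple_cost * total_tuples + total_delay) / num_procs
    sizes = [(finish_time - (k + 1) * init_cost) / tuple_cost
             for k in range(num_procs)]
    # Under this model the last processor would receive a negative fragment
    # once there are too many processors for the given relation size.
    if sizes[-1] < 0:
        raise ValueError("too many processors for this relation size")
    return sizes, finish_time


def uniform_finish_time(total_tuples, num_procs, init_cost, tuple_cost):
    """Global finish time when every processor receives total_tuples / num_procs."""
    # The last processor to start is also the last to finish.
    return num_procs * init_cost + tuple_cost * total_tuples / num_procs


sizes, t_non_uniform = non_uniform_fragments(1000, 4, 10, 1)
t_uniform = uniform_finish_time(1000, 4, 10, 1)
print(sizes, t_non_uniform, t_uniform)
```

In this toy model, processors that start earlier receive larger fragments, so no processor sits idle waiting for the others. The `ValueError` branch marks the point where adding processors no longer helps under these assumptions, which loosely mirrors the upper limit mentioned in the abstract.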
