Parallel Database Systems: The Future of High Performance Database Processing

Parallel database machine architectures have evolved from the use of exotic hardware to a software parallel dataflow architecture based on conventional shared-nothing hardware. These new designs provide impressive speedup and scaleup when processing relational database queries. This paper reviews the techniques used by such systems, and surveys current commercial and research systems.
