Encapsulation of Parallelism and Architecture-Independence in Extensible Database Query Execution

Emerging database application domains demand not only high functionality but also high performance. To satisfy these two requirements, the Volcano query execution engine combines the efficient use of parallelism on a wide variety of computer architectures with an extensible set of query processing operators that can be nested into arbitrarily complex query evaluation plans. Volcano's novel exchange operator permits designing, developing, debugging, and tuning data manipulation operators in single-process environments but executing them in various forms of parallelism. The exchange operator shields the data manipulation operators from all parallelism issues. The authors examine the design and implementation of the generalized exchange operator, justify their decision to support hierarchical architectures, and argue that the exchange operator offers a significant advantage for the development and maintenance of database query processing software. They also show that bit vector filtering can be integrated into the exchange operator paradigm with only minor modifications.
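The encapsulation idea in the abstract can be illustrated with a small sketch. This is not Volcano's implementation: it assumes Python generators as stand-ins for query operators and uses threads with a bounded queue as the transport. The names `scan`, `select`, `exchange`, `plan`, and the `capacity` parameter are all hypothetical. The point is only that `scan` and `select` are written as ordinary single-threaded code, and that all concurrency lives inside `exchange`.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of one producer's stream


def scan(rows):
    """Leaf operator: yield stored rows one at a time."""
    for row in rows:
        yield row


def select(predicate, child):
    """Filter operator, written with no knowledge of threads or queues."""
    for row in child:
        if predicate(row):
            yield row


def exchange(make_child, partitions, capacity=64):
    """Run one copy of the child plan per partition in its own thread and
    merge the outputs into a single stream for the consumer.
    (Hypothetical sketch; the thread-and-queue transport is an assumption.)"""
    out = queue.Queue(maxsize=capacity)  # bounded queue gives simple flow control

    def produce(partition):
        try:
            for row in make_child(partition):
                out.put(row)
        finally:
            out.put(_DONE)  # always signal completion, even on error

    workers = [
        threading.Thread(target=produce, args=(p,), daemon=True)
        for p in partitions
    ]
    for worker in workers:
        worker.start()

    remaining = len(workers)
    while remaining:
        item = out.get()
        if item is _DONE:
            remaining -= 1
        else:
            yield item


def plan(partition):
    """The same plan fragment is used serially and under exchange."""
    return select(lambda r: r % 2 == 0, scan(partition))


partitions = [range(0, 10), range(10, 20)]
serial = sorted(row for p in partitions for row in plan(p))
parallel = sorted(exchange(plan, partitions))
```

Here `plan` is identical in the serial and parallel runs; only the presence of `exchange` changes how it executes. Row order from `exchange` is not deterministic, which is why both results are sorted before comparison. Python threads do not provide true CPU parallelism for this kind of work, so the sketch shows the structure of the idea rather than its performance.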
