The Gamma Database Machine Project

The design of the Gamma database machine and the techniques employed in its implementation are described. Gamma is a relational database machine currently operating on an Intel iPSC/2 hypercube with 32 processors and 32 disk drives. Gamma employs three key technical ideas that enable the architecture to be scaled to hundreds of processors. First, all relations are horizontally partitioned across multiple disk drives, enabling relations to be scanned in parallel. Second, parallel algorithms based on hashing are used to implement the complex relational operators, such as join and aggregate functions. Third, dataflow scheduling techniques are used to coordinate multioperator queries. By using these techniques, it is possible to control the execution of very complex queries with minimal coordination. The design of the Gamma software is described and a thorough performance evaluation of the iPSC/2 hypercube version of Gamma is presented.
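
The first two ideas can be illustrated with a minimal sketch. It is not Gamma's implementation: the function names (`partition`, `local_hash_join`, `parallel_hash_join`) are hypothetical, the "sites" are plain in-memory lists, and they are visited one after another rather than on separate processors. It assumes both relations are partitioned with the same hash function on the join attribute, so matching tuples land at the same site and each site can join its fragments without consulting the others.

```python
from collections import defaultdict


def partition(rows, key, n_sites):
    """Horizontally partition rows across n_sites by hashing the key attribute."""
    sites = [[] for _ in range(n_sites)]
    for row in rows:
        sites[hash(row[key]) % n_sites].append(row)
    return sites


def local_hash_join(build_rows, probe_rows, build_key, probe_key):
    """Join two fragments at one site: build a hash table, then probe it."""
    table = defaultdict(list)
    for b in build_rows:
        table[b[build_key]].append(b)
    out = []
    for p in probe_rows:
        # .get avoids inserting empty buckets for probe keys with no match
        for b in table.get(p[probe_key], []):
            out.append({**b, **p})
    return out


def parallel_hash_join(r, s, r_key, s_key, n_sites):
    """Partition both relations on the join attribute, then join site by site.

    The per-site joins are independent of one another; this sketch simply
    runs them sequentially in a loop.
    """
    r_parts = partition(r, r_key, n_sites)
    s_parts = partition(s, s_key, n_sites)
    result = []
    for site in range(n_sites):
        result.extend(
            local_hash_join(r_parts[site], s_parts[site], r_key, s_key)
        )
    return result


# Hypothetical sample relations with integer join keys.
depts = [{"dept_id": 1, "dept": "db"}, {"dept_id": 2, "dept": "os"}]
emps = [
    {"emp": "a", "dept_id": 1},
    {"emp": "b", "dept_id": 2},
    {"emp": "c", "dept_id": 1},
    {"emp": "d", "dept_id": 3},  # no matching department
]
joined = parallel_hash_join(depts, emps, "dept_id", "dept_id", n_sites=4)
```

Because the per-site joins share no state, the loop over sites is the part that a multiprocessor could run concurrently; coordinating the operators as a dataflow, the third idea, is outside the scope of this sketch.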
