Join processing in database systems with large main memories

We study algorithms for computing the equijoin of two relations in a system with a standard architecture but with large amounts of main memory. Our algorithms are especially efficient when the main memory available is a significant fraction of the size of one of the relations to be joined, but they can be applied whenever there is memory equal to approximately the square root of the size of one relation. We present a new algorithm which is a hybrid of two hash-based algorithms and which dominates the other algorithms we present, including sort-merge. Even in a virtual memory environment, the hybrid algorithm dominates all the others we study. Finally, we describe how three popular tools to increase the efficiency of joins, namely filters, Babb arrays, and semijoins, can be grafted onto any of our algorithms.
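The abstract does not spell out the hybrid algorithm, so the following is only a minimal sketch of the general idea of a hybrid hash join, under stated assumptions: both relations are partitioned on a hash of the join key, one partition of the first relation is kept in an in-memory hash table and probed immediately, and the remaining partitions are set aside and joined pairwise afterwards. The function name `hybrid_hash_join`, the parameter `num_partitions`, and the use of Python lists as stand-ins for partition files on disk are all hypothetical choices made for illustration, not details taken from the paper.

```python
from collections import defaultdict


def hybrid_hash_join(r, s, key_r, key_s, num_partitions=4):
    """Equijoin lists of tuples r and s on r[key_r] == s[key_s].

    Illustrative sketch only: lists stand in for disk-resident partitions.
    """
    def part(value):
        # Assumed partitioning function: hash of the join key modulo
        # the number of partitions.
        return hash(value) % num_partitions

    # Phase 1: scan R once. Partition 0 is built into an in-memory hash
    # table right away; all other partitions are "spilled".
    table0 = defaultdict(list)
    r_spill = defaultdict(list)
    for t in r:
        p = part(t[key_r])
        if p == 0:
            table0[t[key_r]].append(t)
        else:
            r_spill[p].append(t)

    # Phase 2: scan S once. Tuples in partition 0 probe the resident
    # hash table immediately; the rest are spilled for later.
    out = []
    s_spill = defaultdict(list)
    for u in s:
        p = part(u[key_s])
        if p == 0:
            for t in table0.get(u[key_s], []):
                out.append(t + u)
        else:
            s_spill[p].append(u)

    # Phase 3: join each spilled pair of partitions with a fresh
    # in-memory hash table built on the R side.
    for p, r_part in r_spill.items():
        table = defaultdict(list)
        for t in r_part:
            table[t[key_r]].append(t)
        for u in s_spill.get(p, []):
            for t in table.get(u[key_s], []):
                out.append(t + u)
    return out
```

For example, `hybrid_hash_join(r, s, 0, 0)` joins two lists of tuples on their first fields. The square-root condition in the abstract can be read informally in this setting: if a relation is split into roughly the square root of its size in partitions, each partition is also roughly the square root of its size, so memory of that order is enough both for one output buffer per partition while partitioning and for the hash table of one partition afterwards.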
