In this paper, we present a detailed analysis and performance evaluation of the Dynamic Hybrid GRACE Hash Join Method (DHGH Method) when the tuple distribution in buckets is unbalanced. Conventional hash join methods specify the tuple distribution in buckets statically. However, the actual distribution may differ from the estimate, because join operations are applied together with selection operations. When the tuple distribution in buckets is unbalanced, a join processed with the Hybrid Hash Join Method (HH Method) costs more than in the ideal case. The DHGH Method, in contrast, selects the buckets to destage dynamically, and so gives the same performance as the ideal case even when the tuple distribution in buckets is unbalanced, as in Zipf-like distributions. We analyze the total I/O cost of a join operation for various numbers of buckets. The results show that the number of buckets should be determined from the tuple distribution in buckets rather than from the size of the source relation. They also show that it is better to partition the source relation into a large number of small buckets than into the smaller number of buckets, each almost filling the whole main memory, adopted in the HH Method.
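As an illustration of what choosing destaging buckets dynamically can look like, the following is a minimal sketch and not the DHGH Method itself. It assumes that every bucket starts in main memory, that memory holds a fixed number of tuples, and that when memory overflows the largest resident bucket is written out. The function name `partition_with_dynamic_destaging`, its parameters, the modulo hash, and the largest-bucket victim rule are all assumptions made for this example; the abstract does not state the actual selection rule.

```python
from collections import defaultdict


def partition_with_dynamic_destaging(keys, num_buckets, memory_tuples):
    """Partition integer join keys into buckets, destaging on demand.

    Returns (resident, spilled): the buckets still held in memory and the
    number of tuples written to disk. Illustrative only; the victim rule
    (largest resident bucket) is an assumption, not taken from the paper.
    """
    resident = defaultdict(list)  # bucket id -> tuples kept in memory
    destaged = set()              # bucket ids already moved to disk
    in_memory = 0                 # tuples currently held in memory
    spilled = 0                   # tuples written to disk so far

    for key in keys:
        bucket = key % num_buckets  # stand-in for a real hash function
        if bucket in destaged:
            spilled += 1            # bucket lives on disk: write through
            continue
        resident[bucket].append(key)
        in_memory += 1
        if in_memory > memory_tuples:
            # Memory overflow: pick a victim at run time, not in advance.
            victim = max(resident, key=lambda b: len(resident[b]))
            spilled += len(resident[victim])
            in_memory -= len(resident[victim])
            del resident[victim]
            destaged.add(victim)

    return dict(resident), spilled
```

For example, with the keys `[0] * 6 + [1, 2, 3]`, four buckets, and room for five tuples, the skewed bucket 0 is destaged when its sixth tuple arrives, and the three small buckets stay resident. No bucket has to be designated as memory-resident before the actual distribution is known.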