Fine-Grained Progressive Algorithm Based on HMJ

Wide-area distribution raises significant performance problems for traditional query processing techniques, because data access becomes less predictable due to link congestion, load imbalances, and temporary outages. Non-blocking join execution is a promising approach to coping with unpredictability in such environments because it reactively schedules background processing. The classical non-blocking two-way join technique based on hash-merge (HMJ), however, fails to deliver acceptable performance in scenarios where the delays are, on the whole, relatively short and intermittent. We have developed a fine-grained hash-merge join, called HMJ-FG, which employs a replacement selection tree that allows many separate segments to be active in parallel. Using an optimized implementation together with the simulation obtained by Tao, we show that HMJ-FG is an effective solution for providing fast query responses to users even when data sources are unavailable for longer periods. Both theoretical analysis and experimental results show that our technique delivers results quickly over unreliable networks.
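For readers unfamiliar with replacement selection, the following is a minimal sketch of the classical run-generation technique on which a replacement selection tree is based. It is not the HMJ-FG implementation: the function name `replacement_selection`, the `memory` parameter, and the use of a binary heap in place of a tournament tree are assumptions made purely for illustration.

```python
import heapq
from typing import Iterable, Iterator, List

_DONE = object()  # sentinel marking the end of the input


def replacement_selection(items: Iterable[int], memory: int) -> Iterator[List[int]]:
    """Yield sorted runs while holding at most `memory` keys in memory.

    Illustrative sketch only: a heap stands in for the selection tree.
    """
    if memory < 1:
        raise ValueError("memory must be at least 1")
    it = iter(items)

    # Fill the heap with the first `memory` keys, all tagged for run 0.
    # Entries are (run_number, key), so keys deferred to the next run
    # always sort after every key that still belongs to the current run.
    heap = []
    for key in it:
        heap.append((0, key))
        if len(heap) == memory:
            break
    heapq.heapify(heap)

    current_run = 0
    run: List[int] = []
    while heap:
        run_no, key = heapq.heappop(heap)
        if run_no != current_run:
            # The current run is exhausted; emit it and start the next one.
            yield run
            run = []
            current_run = run_no
        run.append(key)

        nxt = next(it, _DONE)
        if nxt is not _DONE:
            # A key smaller than the one just written cannot extend the
            # current run, so it is deferred to the following run.
            target = run_no if nxt >= key else run_no + 1
            heapq.heappush(heap, (target, nxt))
    if run:
        yield run


runs = list(replacement_selection([5, 1, 9, 2, 8, 3, 7], memory=3))
print(runs)
```

With three keys of memory, the first run in this example already contains five keys, which illustrates why replacement selection tends to produce runs longer than the available memory.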

[1] Ning Jing, et al. NSJ: an efficient non-blocking spatial join algorithm, 2006, GIS '06.

[2] Chris Jermaine, et al. The Sort-Merge-Shrink join, 2006, TODS.

[3] Wee Hyong Tok, et al. RRPJ: Result-Rate Based Progressive Relational Join, 2007, DASFAA.

[4] Bernhard Seeger, et al. Progressive Merge Join: A Generic and Non-blocking Sort-based Join Algorithm, 2002, VLDB.

[5] Manolis Koubarakis, et al. Distributed Evaluation of Continuous Equi-join Queries over Large Structured Overlay Networks, 2006, 22nd International Conference on Data Engineering (ICDE'06).

[6] David Thomas, et al. The Art in Computer Programming, 2001.

[7] Bernhard Seeger, et al. On producing join results early, 2003, PODS '03.

[8] Yufei Tao, et al. RPJ: producing fast join results on streams through rate-based optimization, 2005, SIGMOD '05.

[9] Michael J. Franklin, et al. XJoin: Getting Fast Answers From Slow and Bursty Networks, 1999.

[10] Jeffrey Scott Vitter, et al. Modeling and optimizing I/O throughput of multiple disks on a bus, 1999, SIGMETRICS '99.

[11] Jeffrey F. Naughton, et al. Maximizing the Output Rate of Multi-Way Join Queries over Streaming Information Sources, 2003, VLDB.

[12] Wee Hyong Tok, et al. A stratified approach to progressive approximate joins, 2008, EDBT '08.

[13] Walid G. Aref, et al. Hash-merge join: a non-blocking join algorithm for producing fast and early join results, 2004, Proceedings. 20th International Conference on Data Engineering.