Pipelined hash-join on multithreaded architectures

Multi-core and multithreaded processors present both opportunities and challenges in the design of database query processing algorithms. Previous work has shown the potential for performance gains, but also that, in adverse circumstances, multithreading can actually reduce performance. This paper examines the performance of a pipeline of hash-join operations when executing on multithreaded and multicore processors. We examine the optimal number of threads to execute and the partitioning of the workload across those threads. We then describe a buffer-management scheme that minimizes cache conflicts among the threads. Additionally we compare the performance of full materialization of the output at each stage in the pipeline versus passing pointers between stages.

[1]  Dean M. Tullsen,et al.  Simultaneous multithreading: Maximizing on-chip parallelism , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.

[2]  Daniel William Towner,et al.  The 'uniform heterogeneous multi-threaded' processor architecture , 2001 .

[3]  Dinesh Manocha,et al.  GPUTeraSort: high performance graphics co-processor sorting for large database management , 2006, SIGMOD Conference.

[4]  Dean M. Tullsen,et al.  Simultaneous multithreading: a platform for next-generation processors , 1997, IEEE Micro.

[5]  W. Daniel Hillis,et al.  Data parallel algorithms , 1986, CACM.

[6]  David J. DeWitt,et al.  DBMSs on a Modern Processor: Where Does Time Go? , 1999, VLDB.

[7]  Henry F. Korth,et al.  Database hash-join algorithms on multithreaded computer architectures , 2006, CF '06.

[8]  Philip C. Garcia OPTIMIZING DATABASE ALGORITHMS FOR MODERN COMPUTER ARCHITECTURES , 2005 .

[9]  Hidehiko Tanaka,et al.  Application of hash to data base machine and its architecture , 1983, New Generation Computing.

[10]  F. Mueller Pthreads Library Interface , 1993 .

[11]  Jack L. Lo,et al.  Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).

[12]  Jeffrey F. Naughton,et al.  Cache Conscious Algorithms for Relational Query Processing , 1994, VLDB.

[13]  Anastasia Ailamaki,et al.  Improving hash join performance through prefetching , 2004, Proceedings. 20th International Conference on Data Engineering.

[14]  Goetz Graefe,et al.  Encapsulation of parallelism in the Volcano query processing system , 1990, SIGMOD '90.

[15]  James R. Goodman,et al.  Billion-transistor architectures: there and back again , 2004, Computer.

[16]  Kenneth A. Ross,et al.  Improving Database Performance on Simultaneous Multithreading Processors , 2005, VLDB.

[17]  D. Burger,et al.  Billion-Transistor Architectures , 1997, Computer.

[18]  D. Marr,et al.  Hyper-Threading Technology Architecture and MIcroarchitecture , 2002 .

[19]  Balaram Sinharoy,et al.  IBM Power5 chip: a dual-core multithreaded processor , 2004, IEEE Micro.