Database hash-join algorithms on multithreaded computer architectures

As the performance gap between main memory and modern processors widens, database algorithms must become "architecture-aware" to perform well. We address this issue using hash join, one of the most important operations in database query processing, to study how simultaneous multithreading (SMT) and main-memory latency (cache misses) affect performance. Prior work [8] studied cache misses in a simulation based on the Compaq ES40. Our results instead come from measurements on actual hardware (Intel Pentium and Xeon, and AMD Opteron), first for the single-threaded version of the hash-join algorithm used in the prior work and then for a new version designed for multiple threads. We found that hardware prefetching of data from main memory into the CPU cache, as implemented in the architectures we tested, significantly reduces the real-world benefit of software prefetching, contrary to prior work on simulated systems. We also found that SMT achieved significant speedup for our thread-aware hash-join algorithm compared with single-threaded execution on the same single processor. Software prefetching also proved beneficial in this environment.
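To make the notion of software prefetching in a hash join concrete, the following is a minimal sketch of a probe phase that issues prefetches for a small batch of hash buckets before it visits them. It is an illustration under stated assumptions, not the implementation measured in this work: the tuple layout, the chained table, the hash function, the batch size `GROUP`, and all function names are hypothetical, and `__builtin_prefetch` is the GCC/Clang builtin (the macro compiles to a no-op elsewhere).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Prefetch hint: GCC/Clang builtin, a no-op on other compilers. */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH(p) __builtin_prefetch(p)
#else
#define PREFETCH(p) ((void)0)
#endif

#define GROUP 8 /* assumed batch size; a tuning knob, not a measured value */

/* Hypothetical tuple layout: a join key plus a payload. */
typedef struct { uint32_t key; uint32_t payload; } tuple_t;

/* Chained hash table over the build relation: heads[b] is the index of the
   first build tuple in bucket b, next[i] links to the following tuple in the
   same bucket, and -1 ends a chain. */
typedef struct {
    int32_t *heads;
    int32_t *next;
    const tuple_t *build;
    uint32_t mask; /* nbuckets - 1; nbuckets must be a power of two */
} hash_table_t;

static uint32_t hash_key(uint32_t k) { return (k * 2654435761u) >> 16; }

/* Build phase: insert every build tuple at the head of its bucket chain.
   Returns 0 on success, -1 if allocation fails. */
static int ht_build(hash_table_t *ht, const tuple_t *build, size_t n,
                    uint32_t nbuckets) {
    assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
    ht->heads = malloc(nbuckets * sizeof *ht->heads);
    ht->next = malloc((n ? n : 1) * sizeof *ht->next);
    if (!ht->heads || !ht->next) { free(ht->heads); free(ht->next); return -1; }
    ht->build = build;
    ht->mask = nbuckets - 1;
    for (uint32_t b = 0; b < nbuckets; b++) ht->heads[b] = -1;
    for (size_t i = 0; i < n; i++) {
        uint32_t b = hash_key(build[i].key) & ht->mask;
        ht->next[i] = ht->heads[b];
        ht->heads[b] = (int32_t)i; /* sketch assumes n fits in int32_t */
    }
    return 0;
}

/* Probe phase in two stages per batch: stage 1 hashes each probe key and
   prefetches its bucket head; stage 2 walks the chains and counts matches. */
static size_t ht_probe(const hash_table_t *ht, const tuple_t *probe, size_t n) {
    size_t matches = 0;
    for (size_t base = 0; base < n; base += GROUP) {
        size_t end = (base + GROUP < n) ? base + GROUP : n;
        uint32_t bucket[GROUP];
        for (size_t i = base; i < end; i++) {
            bucket[i - base] = hash_key(probe[i].key) & ht->mask;
            PREFETCH(&ht->heads[bucket[i - base]]);
        }
        for (size_t i = base; i < end; i++)
            for (int32_t j = ht->heads[bucket[i - base]]; j >= 0; j = ht->next[j])
                if (ht->build[j].key == probe[i].key) matches++;
    }
    return matches;
}

/* Convenience wrapper: build, probe, free. Stores the match count in *out. */
int hash_join_count(const tuple_t *build, size_t nb, const tuple_t *probe,
                    size_t np, uint32_t nbuckets, size_t *out) {
    hash_table_t ht;
    if (ht_build(&ht, build, nb, nbuckets) != 0) return -1;
    *out = ht_probe(&ht, probe, np);
    free(ht.heads);
    free(ht.next);
    return 0;
}
```

In this sketch the probe loop only reads the hash table, so disjoint ranges of the probe relation could be handed to separate threads without locking. Whether the explicit prefetch helps on a given machine depends on what the hardware prefetcher already covers, which is the question the measurements above address.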

[1] Henry F. Korth, et al. Multithreaded architectures and the sort benchmark, 2005, DaMoN '05.

[2] Philippe Roussel, et al. The microarchitecture of the Intel Pentium 4 processor on 90nm technology, 2004.

[3] S. Sudarshan, et al. Database System Concepts, 4th Edition, 2001.

[4] Jack L. Lo, et al. Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor, 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).

[5] Todd C. Mowry, et al. Improving index performance through prefetching, 2001, SIGMOD '01.

[6] William J. Dally, et al. Memory access scheduling, 2000, Proceedings of 27th International Symposium on Computer Architecture.

[7] Jeffrey F. Naughton, et al. Cache Conscious Algorithms for Relational Query Processing, 1994, VLDB.

[8] James R. Goodman, et al. Billion-transistor architectures: there and back again, 2004, Computer.

[9] Kenneth A. Ross, et al. Improving Database Performance on Simultaneous Multithreading Processors, 2005, VLDB.

[10] Martin L. Kersten, et al. Database Architecture Optimized for the New Bottleneck: Memory Access, 1999, VLDB.

[11] Gary Valentin, et al. Fractal prefetching B+-Trees: optimizing both cache and disk performance, 2002, SIGMOD '02.

[12] Abraham Silberschatz, et al. Database System Concepts, 1980.

[13] David J. DeWitt, et al. DBMSs on a Modern Processor: Where Does Time Go?, 1999, VLDB.

[14] Dean M. Tullsen, et al. Simultaneous multithreading: Maximizing on-chip parallelism, 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.

[15] D. Marr, et al. Hyper-Threading Technology Architecture and Microarchitecture, 2002.

[16] Susan J. Eggers, et al. Improving server software support for simultaneous multithreaded processors, 2003, PPoPP '03.

[17] Anastasia Ailamaki, et al. Improving hash join performance through prefetching, 2004, Proceedings. 20th International Conference on Data Engineering.

[18] Kenneth A. Ross, et al. Implementing database operations using SIMD instructions, 2002, SIGMOD '02.

[19] Susan J. Eggers, et al. An analysis of database workload performance on simultaneous multithreaded processors, 1998, ISCA.

[20] Abraham Silberschatz, et al. Database Systems Concepts, 1997.

[21] David J. Sager, et al. The microarchitecture of the Pentium 4 processor, 2001.

[22] D. Burger, et al. Billion-Transistor Architectures, 1997, Computer.

[23] Dean M. Tullsen, et al. Simultaneous multithreading: a platform for next-generation processors, 1997, IEEE Micro.

[24] Balaram Sinharoy, et al. IBM Power5 chip: a dual-core multithreaded processor, 2004, IEEE Micro.

[25] Martin L. Kersten, et al. Generic Database Cost Models for Hierarchical Memory Systems, 2002, VLDB.