Accelerating database operators using a network processor

Database management systems (DBMSs) do not take full advantage of modern microarchitectural resources, such as wide-issue out-of-order processor pipelines. Increases in processor clock rate and instruction-level parallelism have left memory accesses as the dominant bottleneck in DBMS execution. Prior research indicates that simultaneous multithreading (SMT) can hide memory access latency from a single thread and improve throughput by increasing the number of outstanding memory accesses. Rather than expend chip area and power on out-of-order execution, as in current SMT processors, we demonstrate the effectiveness of using many simple processor cores, each with hardware support for multiple thread contexts. This paper shows an existing hardware architecture—the network processor—already fits the model for multi-threaded, multi-core execution. Using an Intel IXP2400 network processor, we evaluate the performance of three key database operations and demonstrate improvements of 1.9X to 2.5X when compared to a generalpurpose processor.

[1]  Allan Porterfield,et al.  Exploiting heterogeneous parallelism on a multithreaded multiprocessor , 1992, ICS '92.

[2]  Donald Yeung,et al.  Sparcle: an evolutionary processor design for large-scale multiprocessors , 1993, IEEE Micro.

[3]  David J. DeWitt,et al.  Shoring up persistent applications , 1994, SIGMOD '94.

[4]  Todd C. Mowry,et al.  Compiler-based prefetching for recursive data structures , 1996, ASPLOS VII.

[5]  R.E. Johnson,et al.  Evaluation of Multithreaded Uniprocessors for Commercial Application Environments , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).

[6]  David A. Patterson,et al.  Performance characterization of a Quad Pentium Pro SMP using OLTP workloads , 1998, ISCA.

[7]  Sarita V. Adve,et al.  Performance of database workloads on shared-memory systems with out-of-order processors , 1998, ASPLOS VIII.

[8]  Susan J. Eggers,et al.  An analysis of database workload performance on simultaneous multithreaded processors , 1998, ISCA.

[9]  Martin L. Kersten,et al.  Database Architecture Optimized for the New Bottleneck: Memory Access , 1999, VLDB.

[10]  David J. DeWitt,et al.  DBMSs on a Modern Processor: Where Does Time Go? , 1999, VLDB.

[11]  Gurindar S. Sohi,et al.  Effective jump-pointer prefetching for linked data structures , 1999, ISCA.

[12]  Martin L. Kersten,et al.  What Happens During a Join? Dissecting CPU and Memory Optimization Effects , 2000, VLDB.

[13]  Per Stenström,et al.  A prefetching technique for irregular accesses to linked data structures , 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550).

[14]  James R. Larus,et al.  Making Pointer-Based Data Structures Cache Conscious , 2000, Computer.

[15]  David J. DeWitt,et al.  Weaving Relations for Cache Performance , 2001, VLDB.

[16]  Divyakant Agrawal,et al.  Hardware acceleration for spatial selections and joins , 2003, SIGMOD '03.

[17]  Erik J. Johnson,et al.  IXP2400/2800 Programming: The Complete Microengine Coding Guide , 2003 .

[18]  Anastasia Ailamaki,et al.  Improving hash join performance through prefetching , 2004, Proceedings. 20th International Conference on Data Engineering.

[19]  Divyakant Agrawal,et al.  Hardware Acceleration in Commercial Databases: A Case Study of Spatial Operations , 2004, VLDB.

[20]  Dinesh Manocha,et al.  Fast computation of database operations using graphics processors , 2005, SIGGRAPH Courses.