Effect of Node Size on the Performance of Cache-Conscious Indices

In main-memory environments, the number of processor cache misses has a critical impact on system performance. Cache-conscious indices are designed to improve the performance of main-memory indices by reducing the number of processor cache misses incurred during a search operation. Conventional wisdom suggests that an index's node size should equal the cache line size in order to minimize the number of cache misses and improve performance. As we show in this paper, this design choice ignores additional effects, such as instruction count, which play a significant role in determining the overall performance of the index. Using analytical models and a detailed experimental evaluation, we investigate the effect of node size on two common cache-conscious indices: a cache-conscious B+-tree (CSB+-tree) and a cache-conscious extendible hash index. We show that node sizes much larger than the cache line size can result in better search performance for the CSB+-tree. For the hash index, reducing the number of overflow chains is the key to improving search performance, even if doing so requires a node size much larger than the cache line size. Extensive experimental evaluation demonstrates that these node size choices remain valid across a variety of data distributions and for range searches on the CSB+-tree, and that they can also be used to speed up traditional hash-based query operators that use memory-resident indices internally.
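
The trade-off described above can be illustrated with a small sketch. It is not the paper's analytical model: the function name `path_stats`, the 4-byte key size, the 64-byte cache line, and the assumption that a binary search inside a node touches at most floor(log2(lines)) + 1 distinct cache lines are all assumptions made here for illustration. The sketch only counts structural quantities for one root-to-leaf search and makes no claim about actual running time.

```python
import math

def path_stats(num_keys, node_bytes, key_bytes=4, line_bytes=64):
    """Toy structural counts for one root-to-leaf search.

    Assumptions (not from the paper): nodes are filled with keys only,
    so fanout is node_bytes // key_bytes, and a binary search within a
    node touches at most floor(log2(lines)) + 1 distinct cache lines.
    """
    fanout = max(2, node_bytes // key_bytes)

    # Height: smallest h such that fanout**h >= num_keys (integer arithmetic).
    height, capacity = 0, 1
    while capacity < num_keys:
        capacity *= fanout
        height += 1
    height = max(1, height)

    lines_per_node = max(1, node_bytes // line_bytes)
    lines_touched = min(lines_per_node,
                        math.floor(math.log2(lines_per_node)) + 1)

    return {
        "fanout": fanout,
        "nodes_visited": height,                     # per-node instruction overhead scales with this
        "cache_lines_touched": height * lines_touched,
        "key_comparisons": height * math.ceil(math.log2(fanout)),
    }

if __name__ == "__main__":
    for size in (64, 512):
        print(size, path_stats(10_000_000, size))
```

Under these assumptions, a node equal to the cache line size touches the fewest cache lines, while a larger node visits fewer nodes per search. Any fixed per-node cost (function calls, pointer arithmetic, and similar instruction overhead) is therefore paid fewer times with larger nodes, which is the kind of effect that counting cache misses alone leaves out.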
