Main-memory index structures with fixed-size partial keys

The performance of main-memory index structures is increasingly determined by the number of CPU cache misses incurred when traversing the index. When keys are stored indirectly, as is standard in main-memory databases, the cache misses incurred in retrieving keys can dominate the cost of an index traversal. Yet storing even moderately sized keys directly in index nodes is inefficient in both time and space. In this paper, we investigate the performance of tree structures suitable for OLTP workloads in the face of expensive cache misses and non-trivial key sizes. We propose two index structures, pkT-trees and pkB-trees, which significantly reduce cache misses by storing partial-key information in the index. We show that a small, fixed amount of key information is enough to avoid most cache misses, which permits a simple node structure and an efficient implementation. Finally, we study the performance and cache behavior of partial-key trees by comparing them with other main-memory tree structures across a wide variety of key sizes and key value distributions.
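
The general idea of keeping a small, fixed amount of key information in the index node can be illustrated with a minimal Python sketch. It is a simplified stand-in, not the pkT-tree or pkB-tree design: it assumes the partial key is just a fixed-length prefix of the full key, which may differ from the encoding the paper uses. The names `PREFIX_LEN`, `Entry`, `Store`, `compare`, and `search` are hypothetical, and the `derefs` counter is only a rough proxy for the cache misses caused by following a pointer to an indirectly stored key.

```python
from dataclasses import dataclass

PREFIX_LEN = 4  # hypothetical: number of key bytes kept inline in the node


@dataclass
class Entry:
    prefix: bytes   # fixed-size partial key stored directly in the index node
    record_id: int  # indirect reference to the record holding the full key


class Store:
    """Stands in for record storage; counts full-key fetches."""

    def __init__(self, keys):
        self.keys = list(keys)
        self.derefs = 0  # proxy for cache misses from key dereferences

    def full_key(self, record_id):
        self.derefs += 1
        return self.keys[record_id]


def make_entry(store, record_id):
    # Built at insert time, so reading the key here is not counted.
    return Entry(store.keys[record_id][:PREFIX_LEN], record_id)


def compare(search_key, entry, store):
    """Return -1, 0, or 1, fetching the full key only when prefixes tie."""
    search_prefix = search_key[:PREFIX_LEN]
    if search_prefix != entry.prefix:
        # Differing prefixes already decide the order of the full keys.
        return -1 if search_prefix < entry.prefix else 1
    full = store.full_key(entry.record_id)
    return (search_key > full) - (search_key < full)


def search(search_key, node, store):
    """Binary search over one sorted node; returns a record id or None."""
    lo, hi = 0, len(node) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        c = compare(search_key, node[mid], store)
        if c == 0:
            return node[mid].record_id
        if c < 0:
            hi = mid - 1
        else:
            lo = mid + 1
    return None
```

In this sketch a lookup touches the record storage only when the inline prefix cannot decide a comparison, so a search for a key whose prefix matches no entry completes without fetching any full key.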
