Cache Conscious Indexing for Decision-Support in Main Memory

We study indexing techniques for main memory, including hash indexes, binary search trees, T-trees, B+-trees, interpolation search, and binary search on arrays. In a decision-support context, our primary concerns are the lookup time and the space occupied by the index structure. Our goal is to provide faster lookup times than binary search by paying attention to reference locality and cache behavior, without using substantial extra space. We propose a new indexing technique called "Cache-Sensitive Search Trees" (CSS-trees). Our technique stores a directory structure on top of a sorted array. Nodes in this directory have a size matching the cache-line size of the machine. We store the directory in an array and do not store internal-node pointers; child nodes can be found by performing arithmetic on array offsets. We compare the algorithms based on their time and space requirements. We have implemented all of the techniques, and present a performance study on two popular modern machines. We demonstrate that with ...

This research was supported by a David and Lucile Packard Foundation Fellowship in Science and Engineering, by an NSF Young Investigator Award, by NSF grant number IIS-98-12014, and by NSF CISE award CDA-9625374.
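The directory idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' exact CSS-tree layout: it is a simplified, pointer-free multiway directory built over a sorted array, where the names `build_directory` and `search`, the node size `m` (a stand-in for the number of keys that fit in one cache line), and the per-level lists are all assumptions made for illustration. Python also cannot show real cache effects; the sketch only shows how children are located by index arithmetic instead of stored pointers.

```python
import bisect

INF = float("inf")  # sentinel used to pad partially filled nodes


def build_directory(keys, m):
    """Build directory levels (root first) over the sorted list `keys`.

    Each directory node holds m separator keys and has m + 1 children.
    The lowest level of children is the sorted array itself, cut into
    blocks of m keys. No pointers are stored anywhere.
    """
    fanout = m + 1
    n_blocks = (len(keys) + m - 1) // m
    # lows[i] = smallest key reachable under child i of the level being built
    lows = [keys[i * m] for i in range(n_blocks)]
    levels = []
    while len(lows) > 1:
        n_nodes = (len(lows) + fanout - 1) // fanout
        level = []
        for node in range(n_nodes):
            first = node * fanout
            # Separator j is the lowest key of child j (for j = 1..m).
            for j in range(1, fanout):
                child = first + j
                level.append(lows[child] if child < len(lows) else INF)
        levels.append(level)
        lows = [lows[node * fanout] for node in range(n_nodes)]
    levels.reverse()
    return levels


def search(keys, levels, m, target):
    """Return an index i with keys[i] == target, or -1 if absent."""
    fanout = m + 1
    node = 0
    for level in levels:
        base = node * m  # offset of this node's separators in its level
        # The number of separators <= target selects the branch.
        branch = bisect.bisect_right(level, target, base, base + m) - base
        node = node * fanout + branch  # child found by arithmetic only
    lo = node * m
    hi = min(lo + m, len(keys))
    pos = bisect.bisect_left(keys, target, lo, hi)
    return pos if pos < hi and keys[pos] == target else -1


if __name__ == "__main__":
    data = list(range(0, 2000, 2))
    directory = build_directory(data, 16)
    print(search(data, directory, 16, 1234))
    print(search(data, directory, 16, 1235))
```

In this sketch each level is kept in its own list for readability; concatenating the levels into one array with a base offset per level would bring it closer to the single-array directory the abstract describes.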
