Cache Oblivious Algorithms

The cache oblivious model is a simple and elegant model for designing algorithms that perform well on the hierarchical memories ubiquitous in current systems. The model was first formulated in [321] and has since been a topic of intense research. Analyzing and designing algorithms and data structures in this model involves not only an asymptotic analysis of the number of steps executed in terms of the input size, but also an analysis of how data moves among the different levels of the memory hierarchy, with the goal of making that movement optimal. This chapter is intended as an introduction to the “ideal-cache” model of [321] and to the techniques used to design cache oblivious algorithms. It also presents some experimental insights and results.
