Cache-Adaptive Analysis

Memory efficiency and locality have a substantial impact on program performance, particularly when programs operate on large data sets. Thus, memory- or I/O-efficient algorithms have received significant attention in both theory and practice. The widespread deployment of multicore machines, however, brings new challenges. Specifically, since the memory (RAM) is shared across multiple processes, the effective memory size allocated to each process fluctuates over time. This paper presents techniques for designing and analyzing algorithms in a cache-adaptive setting, where the RAM available to the algorithm changes over time. These techniques make analyzing algorithms in the cache-adaptive model almost as easy as in the external-memory (DAM) model.

Our techniques enable us to analyze a wide variety of algorithms: Master-Method-style algorithms, Akra-Bazzi-style algorithms, collections of mutually recursive algorithms, and algorithms, such as FFT, that break problems of size N into subproblems of size Θ(N^c). We demonstrate the effectiveness of these techniques by deriving several results:

1. We give a simple recipe for determining whether common divide-and-conquer cache-oblivious algorithms are optimally cache-adaptive.
2. We show how to bound an algorithm's non-optimality. We give a tight analysis showing that a class of cache-oblivious algorithms is a logarithmic factor worse than optimal.
3. We show the generality of our techniques by analyzing the cache-oblivious FFT algorithm, which is not covered by the above theorems. Nonetheless, the same general techniques show that it is at most an O(log log N) factor away from optimal in the cache-adaptive setting, and that this bound is tight.

These general theorems give concrete results about several algorithms that could not be analyzed using earlier techniques. For example, our results apply to Fast Fourier Transform, matrix multiplication, Jacobi Multipass Filter, and cache-oblivious dynamic-programming algorithms, such as Longest Common Subsequence and Edit Distance. Our results also give algorithm designers clear guidelines for creating optimally cache-adaptive algorithms.
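To make the phrase "divide-and-conquer cache-oblivious algorithm" concrete, the following is a minimal Python sketch of recursive matrix multiplication, one of the algorithms the abstract names. It is an illustration only, not the paper's code: the function names `matmul_add` and `matmul`, the `base` cutoff, and the restriction to square power-of-two matrices are all assumptions made here. The recursion splits an n x n product into eight products of dimension n/2 and never refers to a cache or block size, which is what makes it cache-oblivious. Whether such an algorithm is optimally cache-adaptive is what the paper's analysis decides; this sketch does not establish it.

```python
def matmul_add(A, B, C, ai, aj, bi, bj, ci, cj, n, base=2):
    """Accumulate the n x n block product of A (at ai, aj) and B (at bi, bj)
    into the n x n block of C at (ci, cj). Hypothetical helper for illustration."""
    if n <= base:
        # Base case: plain triple loop on a small block.
        for i in range(n):
            for k in range(n):
                a = A[ai + i][aj + k]
                for j in range(n):
                    C[ci + i][cj + j] += a * B[bi + k][bj + j]
        return
    h = n // 2
    # Eight recursive subproblems of half the dimension; no cache parameter appears.
    for di in (0, h):
        for dj in (0, h):
            for dk in (0, h):
                matmul_add(A, B, C,
                           ai + di, aj + dk,
                           bi + dk, bj + dj,
                           ci + di, cj + dj,
                           h, base)


def matmul(A, B):
    """Multiply two square matrices whose dimension is a power of two (an assumption of this sketch)."""
    n = len(A)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("dimension must be a positive power of two")
    C = [[0] * n for _ in range(n)]
    matmul_add(A, B, C, 0, 0, 0, 0, 0, 0, n)
    return C
```

For example, `matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`.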
