Efficient Stack Distance Computation for a Class of Priority Replacement Policies

The Linear-Scan algorithm (1970), applicable to priority replacement policies, computes stack distances and the number of misses incurred on a given address trace, for all cache sizes, in time O(V) per access. Here, V is the number of distinct (virtual) items referenced within the trace. While the time bound was subsequently lowered to O(log V) for the Least Recently Used (LRU) policy, no improvements have been reported for general priority policies. This work introduces the class of policies with nearly static priorities (NSP), which encompasses several known policies. The Min-Tree algorithm is proposed for NSP policies; its performance is quite sensitive to the policy as well as to the address trace. Under suitable probabilistic assumptions, the expected time per access is O(log² V). Experimental evidence collected on a mix of 30 benchmarks shows that the Min-Tree algorithm can be significantly faster than Linear-Scan for interesting policies such as OPT (or Belady), Least Frequently Used (LFU), and Most Recently Used (MRU). Min-Tree can be parallelized to run in time O(log V) using O(V/log V) processors, in the worst case. A more sophisticated Lazy Min-Tree algorithm is also developed, with O(√V log V) worst-case time per access. This bound applies, in particular, to the policies OPT, LFU, and Least Recently/Frequently Used (LRFU), for which the best previously known bound was O(V). Although random replacement is not an NSP policy, the framework developed in this work leads to a stack-distance algorithm for it with O(log V) expected time per access.
