Performance of large low-associativity caches

Lowering the associativity of a cache is known to degrade its performance, but little is understood about the size of this effect or how to lessen it, especially in very large caches. Most existing work on cache performance is based on simulation or emulation, and there is a lack of analytical models that characterize performance in terms of configuration parameters such as line size, cache size, and associativity, together with workload-specific parameters. Although high associativity may decrease cache misses, for very large caches the corresponding increase in hardware cost and power may be significant. We develop analytical models to study the performance of large cache architectures by capturing the dependence of miss ratio on associativity and other configuration parameters. We use these models, as well as simulation, to study proposals for reducing misses in low-associativity caches, specifically address space randomization and victim caches. Our analysis provides specific detail on the impact of these proposals and a clearer understanding of why they do or do not work.
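To make the dependence of miss ratio on associativity concrete, the following is a minimal trace-driven sketch, not the analytical model developed in the paper. It assumes LRU replacement, a 64-byte line, and a synthetic two-address trace. The names `miss_ratio` and `xor_fold_index` are hypothetical, and the XOR-folded index is only one simple stand-in for an address randomization scheme.

```python
from collections import OrderedDict


def miss_ratio(trace, num_lines, ways, line_size=64, index_fn=None):
    """Miss ratio of an LRU set-associative cache over a trace of byte addresses.

    num_lines is the total number of cache lines; ways is the associativity.
    index_fn(block, num_sets) optionally replaces the default modulo set index.
    """
    num_sets = num_lines // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        block = addr // line_size
        idx = index_fn(block, num_sets) if index_fn else block % num_sets
        lines = sets[idx]
        if block in lines:
            lines.move_to_end(block)       # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used line
            lines[block] = True
    return misses / len(trace)


def xor_fold_index(block, num_sets):
    """Hypothetical randomized index: XOR the index bits with the next higher bits.

    Assumes num_sets is a power of two.
    """
    bits = num_sets.bit_length() - 1
    return (block ^ (block >> bits)) % num_sets


# Two addresses exactly one cache size apart, accessed alternately.
trace = [0, 1024 * 64] * 500

direct_mapped = miss_ratio(trace, num_lines=1024, ways=1)
two_way = miss_ratio(trace, num_lines=1024, ways=2)
randomized = miss_ratio(trace, num_lines=1024, ways=1, index_fn=xor_fold_index)
```

In this contrived trace the two blocks collide in the direct-mapped cache and evict each other on every access, while either a second way or the folded index separates them so that only the two cold misses remain. Real workloads are far less clear-cut, which is the motivation for modeling the effect analytically.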
