Managing bounded code caches in dynamic binary optimization systems

Dynamic binary optimizers store altered copies of original program instructions in software-managed code caches in order to maximize reuse of transformed code. Code caches store code blocks that may vary in size, reference other code blocks, and carry a high replacement overhead. These constraints reduce the effectiveness of conventional cache management policies. Our work addresses them directly and makes several contributions to the code cache management problem. First, we show that evicting more than the minimum number of code blocks from the code cache results in less run-time overhead than the existing alternatives. Evicting in these larger units reduces overall execution time, because the fixed cost of invoking the eviction mechanism is amortized across multiple cache insertions. Second, a study of the ideal lifetimes of dynamically generated code blocks illustrates the benefit of a replacement algorithm based on a generational heuristic. We describe and evaluate a generational approach to code cache management that makes it easy to identify long-lived code blocks while avoiding the fragmentation that evicting short-lived blocks would otherwise cause. Finally, we present results from an implementation of our generational approach in the DynamoRIO framework and illustrate that, as dynamic optimization systems become more prevalent, effective code cache management policies will be essential for reliable, scalable performance of modern applications.
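The amortization argument can be illustrated with a small simulation. The following is a minimal sketch, not the implementation evaluated in this work: it assumes a FIFO replacement order, and the names `FifoCodeCache`, `evict_bytes`, and `count_evictions` are hypothetical. It counts only how often the eviction mechanism is invoked; it does not model the cost of regenerating evicted blocks that are needed again, or of unlinking references between blocks, which is what limits how coarse an eviction unit can usefully be.

```python
from collections import deque


class FifoCodeCache:
    """Toy bounded code cache with FIFO eviction (hypothetical, for illustration)."""

    def __init__(self, capacity, evict_bytes):
        self.capacity = capacity        # total cache size in bytes
        self.evict_bytes = evict_bytes  # minimum bytes to free per eviction call
        self.blocks = deque()           # (block_id, size) pairs, oldest first
        self.used = 0                   # bytes currently occupied
        self.eviction_calls = 0         # times the eviction mechanism was invoked

    def insert(self, block_id, size):
        if size > self.capacity:
            raise ValueError("block is larger than the whole cache")
        if self.used + size > self.capacity:
            self._evict(size)
        self.blocks.append((block_id, size))
        self.used += size

    def _evict(self, needed):
        """Free at least enough room for `needed` bytes, and at least `evict_bytes`."""
        self.eviction_calls += 1
        shortfall = needed - (self.capacity - self.used)
        target = max(shortfall, self.evict_bytes)
        freed = 0
        while self.blocks and freed < target:
            _, size = self.blocks.popleft()  # evict the oldest block first
            freed += size
            self.used -= size


def count_evictions(evict_bytes, capacity=1000, block_size=100, inserts=100):
    """Insert `inserts` equally sized blocks and report eviction invocations."""
    cache = FifoCodeCache(capacity, evict_bytes)
    for block_id in range(inserts):
        cache.insert(block_id, block_size)
    return cache.eviction_calls


if __name__ == "__main__":
    for unit in (1, 500, 1000):
        print(unit, count_evictions(unit))
```

With these assumed parameters, freeing only the minimum space invokes the eviction routine 90 times, freeing half the cache per invocation reduces that to 18, and flushing the whole cache reduces it to 9. Fewer invocations is only half of the trade-off: a coarser unit also discards more blocks that may still be live.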
