Persistent Code Caching: Exploiting Code Reuse Across Executions and Applications

Run-time compilation systems must translate a program's instruction stream while keeping overhead low. Software-managed code caches amortize translation costs, but they are ineffective for programs with short run times or large amounts of cold code. Such program characteristics are prevalent in real-life computing environments, ranging from graphical user interface (GUI) programs to large-scale applications such as database management systems. Persistent code caching addresses these issues. It is described and evaluated in Pin, an industry-strength dynamic binary instrumentation system. The proposed approach extends the intra-execution model of code reuse by storing and reusing translations across executions, thereby achieving inter-execution persistence. Dynamically linked programs additionally benefit from inter-application persistence: they reuse persistent translations of library code that were generated by other programs. New translations discovered across executions are automatically accumulated into the persistent code caches, so performance improves over time. Inter-execution persistence improves the performance of GUI applications by nearly 90%, while inter-application persistence achieves a 59% improvement. In more specialized uses, the SPEC2K INT benchmark suite experiences a 26% improvement under dynamic binary instrumentation. Finally, a 400% speedup is achieved in translating the Oracle database in a regression testing environment.
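
The reuse model described above can be illustrated with a small toy sketch. It is not Pin's implementation or API: the class `PersistentCodeCache`, the helpers `module_key`, `toy_translate`, and `run`, the JSON on-disk format, and the policy of keying a cache by module name plus content hash are all assumptions made for illustration. The sketch shows only the three ideas named in the abstract: translations are stored across executions, a cache for a shared library can be picked up by any program that loads the same library image, and newly discovered translations accumulate in the stored cache.

```python
import hashlib
import json
import os
import tempfile


def module_key(name: str, image: bytes) -> str:
    """Hypothetical cache key: module name plus a content hash, so a
    modified binary never matches a stale persistent cache."""
    return f"{name}-{hashlib.sha256(image).hexdigest()[:16]}"


def toy_translate(block: str) -> str:
    """Stand-in for the expensive translation of one code block."""
    return f"translated({block})"


class PersistentCodeCache:
    """Toy translation cache that is loaded from and saved to disk."""

    def __init__(self, cache_dir: str, key: str):
        self.path = os.path.join(cache_dir, key + ".json")
        self.fresh_translations = 0  # translations performed in this run
        self.entries = {}
        if os.path.exists(self.path):
            # Inter-execution persistence: start from earlier translations.
            with open(self.path) as f:
                self.entries = json.load(f)

    def lookup(self, offset: int, block: str) -> str:
        k = str(offset)  # JSON object keys are strings
        if k not in self.entries:
            self.entries[k] = toy_translate(block)
            self.fresh_translations += 1
        return self.entries[k]

    def save(self) -> None:
        # Accumulation: new translations are written back with the old ones.
        with open(self.path, "w") as f:
            json.dump(self.entries, f)


def run(cache_dir: str, name: str, image: bytes, executed) -> int:
    """Simulate one execution; return how many blocks had to be translated."""
    cache = PersistentCodeCache(cache_dir, module_key(name, image))
    for offset, block in executed:
        cache.lookup(offset, block)
    cache.save()
    return cache.fresh_translations


if __name__ == "__main__":
    d = tempfile.mkdtemp()
    lib = b"shared library image"
    first = run(d, "libdemo.so", lib, [(0, "a"), (16, "b")])
    second = run(d, "libdemo.so", lib, [(0, "a"), (16, "b")])
    third = run(d, "libdemo.so", lib, [(0, "a"), (32, "c")])
    print(first, second, third)  # prints "2 0 1"
```

In this sketch the second run translates nothing because every block is already in the stored cache, and the third run translates only the one block not seen before. Keying by module rather than by program is what lets a different application that loads the same library image reuse the same file; a real system would also need to validate load addresses, instrumentation settings, and tool versions, which this sketch omits.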
