Process-shared and persistent code caches

Software code caches are increasingly being used to amortizethe runtime overhead of tools such as dynamic optimizers, simulators, and instrumentation engines. The additional memory consumed by these caches, along with the data structures used to manage them, limits the scalability of dynamic tool deployment. Inter-process sharing of code caches significantly improves the ability to efficiently apply code caching tools to many processes simultaneously. In this paper, we present a method of code cache sharing among processes for dynamic tools operating on native applications. Our design also supports code cache persistence for improved cold code execution in short-lived processes or long initialization sequences. Sharing raises security concerns, and we show how to achieve sharing without risk of privilege escalation and with read-only code caches and associated data structures. We evaluate process-shared and persisted code caches implemented in the DynamoRIO industrial-strength dynamic instrumentation engine, where we achieve a two-thirds reduction in both memory usage and startup time.

[1]  Cindy Zheng,et al.  PA-RISC to IA-64: Transparent Execution, No Recompilation , 2000, Computer.

[2]  L. Peter Deutsch,et al.  Efficient implementation of the smalltalk-80 system , 1984, POPL.

[3]  Jonathan S. Shapiro,et al.  HDTrans: an open source, low-level dynamic instrumentation system , 2006, VEE '06.

[4]  Mary Lou Soffa,et al.  Retargetable and reconfigurable software dynamic translation , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003..

[5]  Håkan Grahn,et al.  SimICS/Sun4m: A Virtual Workstation , 1998, USENIX Annual Technical Conference.

[6]  David Keppel,et al.  Shade: a fast instruction-set simulator for execution profiling , 1994, SIGMETRICS.

[7]  Ali-Reza Adl-Tabatabai,et al.  Fast, effective code generation in a just-in-time Java compiler , 1998, PLDI.

[8]  Michael D. Smith,et al.  Characterizing Inter-Execution and Inter-Application Optimization Persistence , 2003 .

[9]  James R. Larus,et al.  EEL: machine-independent executable editing , 1995, PLDI '95.

[10]  Peng Zhang,et al.  Module-aware translation for real-life desktop applications , 2005, VEE '05.

[11]  Nicholas Nethercote,et al.  Valgrind: A Program Supervision Framework , 2003, RV@CAV.

[12]  Harish Patil,et al.  Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.

[13]  Urs Hölzle,et al.  Adaptive optimization for self: reconciling high performance with exploratory programming , 1994 .

[14]  Vijay Janapa Reddi,et al.  Persistence in dynamic code transformation systems , 2005, CARN.

[15]  Amitabh Srivastava,et al.  Vulcan Binary transformation in a distributed environment , 2001 .

[16]  Richard Johnson,et al.  The Transmeta Code Morphing/spl trade/ Software: using speculation, recovery, and adaptive retranslation to address real-life challenges , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003..

[17]  Sorin Lerner,et al.  Mojo: A Dynamic Optimization System , 2000 .

[18]  Cristina Cifuentes,et al.  Walkabout: a retargetable dynamic binary translation framework , 2002 .

[19]  Derek Bruening,et al.  Maintaining consistency and bounding capacity of software code caches , 2005, International Symposium on Code Generation and Optimization.

[20]  K. Ebcioglu,et al.  Daisy: Dynamic Compilation For 10o?40 Architectural Compatibility , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[21]  Marianne Shaw,et al.  Denali: Lightweight Virtual Machines for Distributed and Networked Applications , 2001 .

[22]  Scott Devine,et al.  Disco: running commodity operating systems on scalable multiprocessors , 1997, TOCS.

[23]  Erik R. Altman,et al.  Daisy: Dynamic Compilation For 10o?40 Architectural Compatibility , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[24]  Zheng Wang,et al.  System support for automatic profiling and optimization , 1997, SOSP.

[25]  Mendel Rosenblum,et al.  Embra: fast and flexible machine simulation , 1996, SIGMETRICS '96.

[26]  Nathaniel Nystrom,et al.  Code Sharing among Virtual Machines , 2002, ECOOP.

[27]  Paolo Faraboschi,et al.  DELI: a new run-time control point , 2002, MICRO.

[28]  Matthew Arnold,et al.  Adaptive optimization in the Jalapeño JVM , 2000, OOPSLA '00.

[29]  Michael D. Smith,et al.  Persistent Code Caching: Exploiting Code Reuse Across Executions and Applications , 2007, International Symposium on Code Generation and Optimization (CGO'07).

[30]  Andrew Kennedy,et al.  Combining Generics, Pre-compilation and Sharing Between Software-Based Processes , 2004 .

[31]  Alec Wolman,et al.  Instrumentation and optimization of Win32/intel executables using Etch , 1997 .

[32]  John Yates,et al.  FX!32 a profile-directed binary translator , 1998, IEEE Micro.

[33]  Chi-Keung Luk,et al.  PinOS: a programmable framework for whole-system dynamic instrumentation , 2007, VEE '07.

[34]  Derek Bruening,et al.  Efficient, transparent, and comprehensive runtime code manipulation , 2004 .