Improving Startup Performance in Dynamic Binary Translators

A Dynamic Binary Translation (DBT) system translates program binaries built for a guest platform into code for the host machine at run time, one basic block at a time. Even after optimization, the auxiliary tasks that the DBT system performs alongside program emulation add overhead relative to executing the program natively on the guest platform. In this work, we analyze the extent and causes of a DBT system's startup latency. We then focus on understanding and alleviating the cost of program translation, which contributes significantly and disproportionately to startup overhead. We propose and assess a new technique that parallelizes program translation on multi-core machines to reduce its run-time cost. We explain the challenges in achieving such parallelization, and we discuss and evaluate solutions.
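To make the idea concrete, the following is a minimal sketch of block-at-a-time translation with an optional pool of translation workers. It is an illustration under stated assumptions, not the system evaluated in this work: the toy guest program `GUEST_BLOCKS`, the helpers `translate_block` and `run`, and the policy of translating every known block up front are all hypothetical. CPython threads also do not run CPU-bound work truly in parallel, so the sketch shows the structure of the approach rather than its speedup.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy guest program: block address -> (guest ops, next block address or None).
GUEST_BLOCKS = {
    0: ([("add", 5)], 1),
    1: ([("mul", 3)], 2),
    2: ([("add", -1)], None),
}


def translate_block(addr):
    """Translate one guest basic block into a host-callable function (toy stand-in)."""
    ops, next_addr = GUEST_BLOCKS[addr]

    def host_code(acc):
        # "Translated" code: apply each guest op to the accumulator.
        for op, operand in ops:
            acc = acc + operand if op == "add" else acc * operand
        return acc, next_addr

    return host_code


def run(entry, acc, workers=0):
    """Emulate from `entry`; with workers > 0, translate blocks on a worker pool first."""
    code_cache = {}
    if workers:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Assumption: all block addresses are known in advance, so each one
            # can be handed to a worker and collected into the code cache.
            futures = {addr: pool.submit(translate_block, addr) for addr in GUEST_BLOCKS}
            code_cache = {addr: f.result() for addr, f in futures.items()}

    on_demand = 0  # translations performed on the execution thread
    addr = entry
    while addr is not None:
        if addr not in code_cache:
            # Code-cache miss: translate on the critical path, as a sequential DBT would.
            code_cache[addr] = translate_block(addr)
            on_demand += 1
        acc, addr = code_cache[addr](acc)
    return acc, on_demand
```

Calling `run(0, 1)` translates each block on the execution thread when it is first reached, whereas `run(0, 1, workers=2)` finds every block already in the code cache. A real DBT system discovers blocks as execution proceeds, so it cannot simply enumerate them ahead of time as this sketch does.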
