An approach to minimizing the interpretation overhead in Dynamic Binary Translation

Dynamic Binary Translation (DBT) is widely used to convert binary code from one Instruction Set Architecture (ISA) to another at run time, optimizing the code when necessary. DBT systems often adopt a two-stage strategy that handles hot code by translation and cold code by interpretation to keep execution efficient. However, the overhead of interpretation remains excessively high. Interpretation has been observed to decode the same instructions repeatedly, and much of this decoding work is redundant. This paper introduces an approach called Decoded Instruction Cache (DICache), which caches the decoded information of previously interpreted instructions and reuses it as much as possible when those instructions are interpreted again. Both software and hardware implementations of DICache have been benchmarked. The experimental results indicate that DICache removes a significant share of the redundant decoding operations, which in turn leads to a dramatic decline in interpretation overhead.
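
The caching idea can be illustrated with a minimal sketch. It is not the paper's implementation: the toy 16-bit instruction encoding, the direct-mapped organization indexed by program counter, and all names (`Decoded`, `decode`, `DICache`, `fetch_decoded`, `interpret`, `PROGRAM`) are assumptions made for illustration. The sketch shows only how an interpreter might look up previously decoded information before decoding again.

```python
from dataclasses import dataclass

# Hypothetical toy ISA, assumed for illustration: [opcode:4][dst:4][src/imm:8]
OPCODES = {0: "li", 1: "add", 2: "dec", 3: "jnz", 4: "halt"}


@dataclass(frozen=True)
class Decoded:
    op: str
    dst: int
    src: int


def decode(word):
    """Decode one 16-bit instruction word (the work the cache tries to avoid)."""
    return Decoded(OPCODES[(word >> 12) & 0xF], (word >> 8) & 0xF, word & 0xFF)


class DICache:
    """Direct-mapped cache of decoded instructions, tagged by guest PC (assumed design)."""

    def __init__(self, size=64):
        self.size = size
        self.tags = [None] * size
        self.entries = [None] * size
        self.hits = 0
        self.misses = 0

    def fetch_decoded(self, pc, memory):
        idx = pc % self.size
        if self.tags[idx] == pc:          # hit: reuse the earlier decoding
            self.hits += 1
            return self.entries[idx]
        self.misses += 1                  # miss: decode once and remember it
        decoded = decode(memory[pc])
        self.tags[idx] = pc
        self.entries[idx] = decoded
        return decoded


def interpret(memory, cache):
    """Interpret the toy program, consulting the cache on every fetch."""
    regs = [0] * 16
    pc = 0
    while True:
        ins = cache.fetch_decoded(pc, memory)
        pc += 1
        if ins.op == "li":
            regs[ins.dst] = ins.src
        elif ins.op == "add":
            regs[ins.dst] += regs[ins.src]
        elif ins.op == "dec":
            regs[ins.dst] -= 1
        elif ins.op == "jnz":
            if regs[ins.dst] != 0:
                pc = ins.src
        elif ins.op == "halt":
            return regs


# li r0,0 ; li r1,5 ; loop: add r0,r1 ; dec r1 ; jnz r1,loop ; halt
PROGRAM = [0x0000, 0x0105, 0x1001, 0x2100, 0x3102, 0x4000]

if __name__ == "__main__":
    cache = DICache()
    regs = interpret(PROGRAM, cache)
    print(regs[0], cache.hits, cache.misses)
```

In this sketch each distinct instruction is decoded once and every later execution of the loop body is served from the cache. A real design would also have to invalidate entries when guest code is modified, which the sketch leaves out.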
