Partitioning the Conventional DBT System for Multiprocessors

Performance gains from ever-increasing transistor counts are stalling because software cannot efficiently exploit the hardware resources they provide, such as multiple cores. Dynamic binary translation (DBT) systems face the same problem. Special-purpose hardware, attached as an aid through interfaces that the DBT system provides, can raise performance, but it has a significant limitation: it depends on the platform and cannot be reused widely by other systems. To overcome this drawback, we focus on building a compatible software architecture that achieves higher performance without platform dependence. In this paper, we propose a novel multithreaded architecture for DBT systems that partitions the system into distinct functional modules in order to make fuller use of multiprocessor resources. The new architecture decouples the working routine of a conventional DBT system into dynamic translation, optimization, and translated-code execution phases, and assigns these phases to separate threads so that they can execute concurrently. Within this architecture, we present several efficient new methods for problems that trouble most researchers, such as the communication mechanism, cache layout, and mutual exclusion between threads. Experimental results on SPECint 2000 indicate that the new architecture speeds up the traditional DBT system by about 10.75% on average, with better CPU utilization.
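
The three-phase decoupling described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `run_pipeline`, the use of FIFO queues as the inter-thread communication mechanism, and the string stand-ins for translated and optimized code are all assumptions made for illustration only. The abstract does not specify how the threads actually communicate or how the code cache is laid out.

```python
import queue
import threading


def run_pipeline(guest_blocks):
    """Toy three-stage pipeline: translate -> optimize -> execute.

    Hypothetical illustration only; each stage runs in its own thread
    and hands work to the next stage through a thread-safe FIFO queue.
    """
    to_optimize = queue.Queue()   # translator -> optimizer
    to_execute = queue.Queue()    # optimizer -> executor
    executed = []                 # written by the executor thread only
    done = object()               # sentinel marking the end of the stream

    def translator():
        for block in guest_blocks:
            to_optimize.put(f"translated({block})")
        to_optimize.put(done)

    def optimizer():
        while (item := to_optimize.get()) is not done:
            to_execute.put(f"optimized({item})")
        to_execute.put(done)

    def executor():
        while (item := to_execute.get()) is not done:
            executed.append(item)

    threads = [threading.Thread(target=stage)
               for stage in (translator, optimizer, executor)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executed


if __name__ == "__main__":
    print(run_pipeline(["block0", "block1"]))
```

The queues here stand in for both the communication mechanism and the mutual exclusion between threads, since `queue.Queue` does its own locking. The sketch shows only the structure of the pipeline: in CPython the global interpreter lock prevents these threads from running Python code in parallel, so it says nothing about the speedup reported above.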
