Fine-Grained Overhead Analysis Utilizing Atomic Instructions for Cross-ISA Dynamic Binary Translation on Multicore Processor

Modern software often invokes other applications while an active application is running in order to fulfil its task requirements, which increases program processing time on heterogeneous multicore system-on-chip (SoC) processors. To use the available cores more efficiently, concurrent compilation techniques have been applied to mixed static and dynamic modes of Dynamic Binary Translation and Optimisation (DBTO), so that combined application workloads are served better. This research presents a finer-grained DBTO overhead analysis that categorises and characterises the sources of overhead at each stage of concurrent instruction processing. A dual-engine architecture, with separate translation and optimisation engines, is constructed for finer management of start-up overheads. Helper functions, i.e. Load-Link/Store-Conditional (LL/SC), are derived from atomic instructions to create multiple helper threads supported by multiple host cores, so that instruction translation and optimisation can proceed concurrently. Our experimental platform, evaluated with the PARSEC-3.0 benchmark suite, showed performance improvements approaching 2.0x for application-based programs and 1.25x for kernel-based programs for x86 to x86-64 emulation. This technique explores performance beyond the limits of hardware-only and software-only approaches, and has great potential for improving future parallel program processing.
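
The abstract does not say how the LL/SC helper functions are built from host atomic instructions. As an illustration only, the following minimal sketch shows one common way an emulator can approximate Load-Link/Store-Conditional on a host that offers compare-and-swap, using C11 atomics. The names `llsc_monitor`, `helper_load_link`, `helper_store_conditional` and `guest_atomic_increment` are hypothetical and are not taken from the paper. Comparing values rather than tracking writes means this approach does not detect an intervening write that restores the original value (the ABA case).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-virtual-CPU reservation state (an assumption, not the paper's design). */
typedef struct {
    _Atomic uint32_t *addr;  /* address reserved by the last load-link, NULL if none */
    uint32_t value;          /* value observed by that load-link */
} llsc_monitor;

/* Load-link: read the word and remember the address and the value seen. */
static uint32_t helper_load_link(llsc_monitor *m, _Atomic uint32_t *addr)
{
    m->addr = addr;
    m->value = atomic_load(addr);
    return m->value;
}

/* Store-conditional: succeed only if a reservation exists for this address
 * and the word still holds the value seen by the load-link. The check and
 * the store happen in a single host compare-and-swap. */
static bool helper_store_conditional(llsc_monitor *m, _Atomic uint32_t *addr,
                                     uint32_t new_value)
{
    if (m->addr != addr) {
        return false;                 /* no matching reservation */
    }
    uint32_t expected = m->value;
    m->addr = NULL;                   /* a reservation is consumed by any SC attempt */
    return atomic_compare_exchange_strong(addr, &expected, new_value);
}

/* A guest-style atomic increment built as an LL/SC retry loop.
 * Returns the value the word held before the increment. */
static uint32_t guest_atomic_increment(llsc_monitor *m, _Atomic uint32_t *addr)
{
    uint32_t old;
    do {
        old = helper_load_link(m, addr);
    } while (!helper_store_conditional(m, addr, old + 1));
    return old;
}
```

In this sketch each helper thread would own one `llsc_monitor`, so the only shared state is the guest memory word itself, and contention between threads is resolved by the host compare-and-swap.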
