Acceldroid: Co-designed acceleration of Android bytecode

A hardware/software co-designed processor transparently supports a ubiquitous ISA (e.g. ×86) with diversified and innovative microarchitectural implementations. It leverages co-designed HW features and dynamic binary translation (DBT) SW to morph existing binary programs to scale performance and save power. On such systems, the portable bytecode of modern dynamic languages (e.g. Java, JavaScript, etc.) is first translated into the code in the architecture ISA by the just-in-time (JIT) compilation in the bytecode virtual machine, and then into the code in the internal implementation ISA by the DBT. This not only incurs the translation overheads twice, but also brings significant emulation inefficiency as the DBT does not have the high level bytecode information. In this paper, we present AccelDroid, which accelerates the Android Dalvik bytecode execution on the HW/SW co-designed processor through direct bytecode translation in the DBT. Our experiments on a HW/SW co-designed Transmeta Efficeon machine show that AccelDroid can improve performance by 78% and save energy by 40% for the CaffeineMark 3.0 benchmark suite.

[1]  Sanjay J. Patel,et al.  Increasing the size of atomic instruction blocks using control flow assertions , 2000, MICRO 33.

[2]  Craig B. Zilles,et al.  A real system evaluation of hardware atomicity for software speculation , 2010, ASPLOS XV.

[3]  Erik R. Altman,et al.  Daisy: Dynamic Compilation For 10o?40 Architectural Compatibility , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[4]  Derek Bruening,et al.  An infrastructure for adaptive dynamic optimization , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003..

[5]  Manish Gupta,et al.  Power-Aware Microarchitecture: Design and Modeling Challenges for Next-Generation Microprocessors , 2000, IEEE Micro.

[6]  Cheng Wang,et al.  Modeling and Performance Evaluation of TSO-Preserving Binary Optimization , 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.

[7]  Scott A. Mahlke,et al.  The superblock: An effective technique for VLIW and superscalar compilation , 1993, The Journal of Supercomputing.

[8]  Richard Johnson,et al.  The Transmeta Code Morphing#8482; Software: using speculation, recovery, and adaptive retranslation to address real-life challenges , 2003, CGO.

[9]  Richard Johnson,et al.  The Transmeta Code Morphing/spl trade/ Software: using speculation, recovery, and adaptive retranslation to address real-life challenges , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003..

[10]  James E. Smith,et al.  Hardware Support for Control Transfers in Code Caches , 2003, MICRO.

[11]  Yun Wang,et al.  IA-32 Execution Layer: a two-phase dynamic translator designed to support IA-32 applications on Itanium-based systems , 2003, MICRO.

[12]  K. Ebcioglu,et al.  Daisy: Dynamic Compilation For 10o?40 Architectural Compatibility , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[13]  Yun Wang,et al.  IA-32 execution layer: a two-phase dynamic translator designed to support IA-32 applications on Itanium/spl reg/-based systems , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..

[14]  Craig B. Zilles,et al.  Hardware atomicity for reliable software speculation , 2007, ISCA '07.

[15]  L. Peter Deutsch,et al.  Efficient implementation of the smalltalk-80 system , 1984, POPL.

[16]  Jonathan S. Shapiro,et al.  HDTrans: an open source, low-level dynamic instrumentation system , 2006, VEE '06.

[17]  Toshiaki Yasue,et al.  A dynamic optimization framework for a Java just-in-time compiler , 2001, OOPSLA '01.

[18]  Vasanth Bala,et al.  Dynamo: a transparent dynamic optimization system , 2000, SIGP.

[19]  Margaret Martonosi,et al.  Improving prediction for procedure returns with return-address-stack repair mechanisms , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.

[20]  Cheng Wang,et al.  Selective Runtime Memory Disambiguation in a Dynamic Binary Translator , 2006, CC.

[21]  James E. Smith,et al.  Virtual machines - versatile platforms for systems and processes , 2005 .

[22]  Harish Patil,et al.  Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.