Exploiting SIMD Asymmetry in ARM-to-x86 Dynamic Binary Translation
[1] Brendan Dolan-Gavitt, et al. Repeatable Reverse Engineering with PANDA, 2015, PPREW@ACSAC.
[2] Vikram S. Adve, et al. LLVM: a compilation framework for lifelong program analysis & transformation, 2004, International Symposium on Code Generation and Optimization (CGO 2004).
[3] David H. Bailey, et al. The NAS Parallel Benchmarks, 1991, Int. J. High Perform. Comput. Appl.
[4] Intel Corporation. Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1, 2006.
[5] Vasanth Bala, et al. Dynamo: a transparent dynamic optimization system, 2000, SIGP.
[6] Avinash Sodani, et al. Intel Xeon Phi Processor High Performance Programming: Knights Landing Edition, 2nd Edition, 2016.
[7] J. Dongarra, et al. HPCG Benchmark: a New Metric for Ranking High Performance Computing Systems, 2015.
[8] Wei-Chung Hsu, et al. Exploiting Longer SIMD Lanes in Dynamic Binary Translation, 2016, 2016 IEEE 22nd International Conference on Parallel and Distributed Systems (ICPADS).
[9] Philippe Clauss, et al. Runtime Vectorization Transformations of Binary Code, 2017, International Journal of Parallel Programming.
[10] Cheng Wang, et al. StarDBT: An Efficient Multi-platform Dynamic Binary Translation System, 2007, Asia-Pacific Computer Systems Architecture Conference.
[11] Timothy M. Jones, et al. PSLP: Padded SLP automatic vectorization, 2015, 2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[12] Mateo Valero, et al. Speculative dynamic vectorization, 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[13] Gagan Agrawal, et al. An execution strategy and optimized runtime support for parallelizing irregular reductions on modern GPUs, 2011, ICS '11.
[14] Kunle Olukotun, et al. Efficient Parallel Graph Exploration on Multi-Core CPU and GPU, 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[15] Richard Johnson, et al. The Transmeta Code Morphing™ Software: using speculation, recovery, and adaptive retranslation to address real-life challenges, 2003, CGO.
[16] Wei-Chung Hsu, et al. SIMD Code Translation in an Enhanced HQEMU, 2015, 2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS).
[17] Fabrice Bellard, et al. QEMU, a Fast and Portable Dynamic Translator, 2005, USENIX ATC, FREENIX Track.
[18] Harish Patil, et al. Pin: building customized program analysis tools with dynamic instrumentation, 2005, PLDI '05.
[19] Yun Wang, et al. IA-32 execution layer: a two-phase dynamic translator designed to support IA-32 applications on Itanium®-based systems, 2003, 36th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-36).
[20] Jim Jeffers, et al. Knights Landing overview, 2016.
[21] Albert Cohen, et al. Vapor SIMD: Auto-vectorize once, run everywhere, 2011, International Symposium on Code Generation and Optimization (CGO 2011).
[22] Wei-Chung Hsu, et al. Exploiting Asymmetric SIMD Register Configurations in ARM-to-x86 Dynamic Binary Translation, 2017, 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[23] Richard Johnson, et al. The Transmeta Code Morphing™ Software: using speculation, recovery, and adaptive retranslation to address real-life challenges, 2003, International Symposium on Code Generation and Optimization (CGO 2003).
[24] Mary Alexandra Agner. Bristled wings could provide a propulsive punch for future micro air vehicles, 2018, Scilight.
[25] Hao Zhou, et al. Exploiting mixed SIMD parallelism by reducing data reorganization overhead, 2016, 2016 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[26] Wei-Chung Hsu, et al. Design and Implementation of a Lightweight Dynamic Optimization System, 2004, J. Instr. Level Parallelism.
[27] Scott A. Mahlke, et al. Liquid SIMD: Abstracting SIMD Hardware using Lightweight Dynamic Mapping, 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[28] Chien-Min Wang, et al. HQEMU: a multi-threaded and retargetable dynamic binary translator on multicores, 2012, CGO '12.
[29] Philippe Clauss, et al. Dynamic re-vectorization of binary code, 2015, 2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS).
[30] Frédéric Pétrot, et al. Speeding-up SIMD instructions dynamic binary translation in embedded processor simulation, 2011, 2011 Design, Automation & Test in Europe.
[31] Jaewook Shin, et al. Superword-level parallelism in the presence of control flow, 2005, International Symposium on Code Generation and Optimization.
[32] Yun Wang, et al. IA-32 Execution Layer: a two-phase dynamic translator designed to support IA-32 applications on Itanium-based systems, 2003, MICRO.
[33] David A. Padua, et al. An Evaluation of Vectorizing Compilers, 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[34] Bo Huang, et al. Optimizing dynamic binary translation for SIMD instructions, 2006, International Symposium on Code Generation and Optimization (CGO'06).
[35] Seonggun Kim, et al. Efficient SIMD code generation for irregular kernels, 2012, PPoPP '12.
[36] Saman P. Amarasinghe, et al. Exploiting superword level parallelism with multimedia instruction sets, 2000, PLDI '00.
[37] Nicholas Nethercote, et al. Valgrind: a framework for heavyweight dynamic binary instrumentation, 2007, PLDI '07.
[38] John Yates, et al. FX!32 a profile-directed binary translator, 1998, IEEE Micro.
[39] Hao Zhou, et al. A Compiler Approach for Exploiting Partial SIMD Parallelism, 2016, ACM Trans. Archit. Code Optim.