Fast-path loop unrolling of non-counted loops to enable subsequent compiler optimizations
暂无分享,去创建一个
Hanspeter Mössenböck | Thomas Würthinger | Lukas Stadler | Roland Schatz | David Leopoldseder | Manuel Rigger | Manuel Rigger | H. Mössenböck | Roland Schatz | Thomas Würthinger | David Leopoldseder | Lukas Stadler
[1] Matthias Hauswirth,et al. Use at your own risk: the Java unsafe API in the wild , 2015, OOPSLA.
[2] Thomas Würthinger,et al. Making collection operations optimal with aggressive JIT compilation , 2017, SCALA@SPLASH.
[3] Andreas Krall,et al. Compilation Techniques for Multimedia Processors , 2004, International Journal of Parallel Programming.
[4] Sanjay Jinturkar,et al. Aggressive Loop Unrolling in a Retargetable Optimizing Compiler , 1996, CC.
[5] Vivek Sarkar. Optimized unrolling of nested loops , 2000, ICS '00.
[6] Hanspeter Mössenböck,et al. An object storage model for the truffle language implementation framework , 2014, PPPJ '14.
[7] Christopher A. Vick,et al. The Java HotSpotTM Server Compiler , 2001 .
[8] J. C. Huang,et al. Generalized loop-unrolling: a method for program speedup , 1999, Proceedings 1999 IEEE Symposium on Application-Specific Systems and Software Engineering and Technology. ASSET'99 (Cat. No.PR00122).
[9] Hanspeter Mössenböck,et al. The taming of the shrew: increasing performance by automatic parameter tuning for java garbage collectors , 2014, ICPE.
[10] Vicki H. Allan,et al. Software pipelining , 1995, CSUR.
[11] Jing Wang,et al. Loop-carried dependence and the general URPR software pipelining approach (unrolling, pipelining and rerolling) , 1991, Proceedings of the Twenty-Fourth Annual Hawaii International Conference on System Sciences.
[12] Cliff Click,et al. Global code motion/global value numbering , 1995, PLDI '95.
[13] Prasad A. Kulkarni,et al. AOT vs. JIT: impact of profile data on code quality , 2017, LCTES.
[14] Sharad Malik,et al. Performance estimation of embedded software with instruction cache modeling , 1999, TODE.
[15] Stamatis Vassiliadis,et al. Instruction-level parallel processors , 1995 .
[16] David F. Bacon,et al. Compiler transformations for high-performance computing , 1994, CSUR.
[17] Roy Dz-Ching Ju,et al. A compiler framework for speculative analysis and optimizations , 2003, PLDI '03.
[18] Hanspeter Mössenböck,et al. Partial Escape Analysis and Scalar Replacement for Java , 2014, CGO '14.
[19] Andreu Carminati,et al. Combining loop unrolling strategies and code predication to reduce the worst-case execution time of real-time software , 2017 .
[20] Jack W. Davidson,et al. An Aggressive Approach to Loop Unrolling , 2001 .
[21] Amer Diwan,et al. The DaCapo benchmarks: java benchmarking development and analysis , 2006, OOPSLA '06.
[22] Hanspeter Mössenböck,et al. Dominance-based duplication simulation (DBDS): code duplication to enable compiler optimizations , 2018, CGO.
[23] Michael Wolfe,et al. Beyond induction variables , 1992, PLDI '92.
[24] Andreas Schörgenhumer,et al. Efficient Tracing and Versatile Analysis of Lock Contention in Java Applications on the Virtual Machine Level , 2016, ICPE.
[25] Alon Zakai,et al. Bringing the web up to speed with WebAssembly , 2017, PLDI.
[26] Alan E. Charlesworth,et al. An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS-164 Family , 1981, Computer.
[27] Amer Diwan,et al. Type-based alias analysis , 1998, PLDI.
[28] Frank Yellin,et al. The Java Virtual Machine Specification , 1996 .
[29] Thomas Würthinger. Dynamic code evolution for Java , 2010, PPPJ.
[30] Yoshihiko Futamura,et al. Partial Evaluation of Computation Process--An Approach to a Compiler-Compiler , 1999, High. Order Symb. Comput..
[31] Alon Zakai. Emscripten: an LLVM-to-JavaScript compiler , 2011, OOPSLA Companion.
[32] Christian Wimmer,et al. One VM to rule them all , 2013, Onward!.
[33] Steven S. Muchnick,et al. Efficient instruction scheduling for a pipelined architecture , 1986, SIGPLAN '86.
[34] Jack W. Davidson,et al. Improving instruction-level parallelism by loop unrolling and dynamic memory disambiguation , 1995, MICRO.
[35] Yi Lin,et al. Stop and go: understanding yieldpoint behavior , 2015, ISMM.
[36] Mira Mezini,et al. Da capo con scala: design and analysis of a scala benchmark suite for the java virtual machine , 2011, OOPSLA '11.
[37] David C. Hoaglin,et al. Some Implementations of the Boxplot , 1989 .
[38] Ahmed El-Mahdy,et al. Automatic Vectorization Using Dynamic Compilation and Tree Pattern Matching Technique in Jikes RVM , 2009 .
[39] Christian Wimmer,et al. Self-optimizing AST interpreters , 2012, DLS.
[40] Hanspeter Mössenböck,et al. Graal IR : An Extensible Declarative Intermediate Representation , 2013 .
[41] Mark N. Wegman,et al. Efficiently computing static single assignment form and the control dependence graph , 1991, TOPL.
[42] Philip H. Sweany,et al. Improving software pipelining with unroll-and-jam , 1996, Proceedings of HICSS-29: 29th Hawaii International Conference on System Sciences.
[43] Hanspeter Mössenböck,et al. An experimental study of the influence of dynamic compiler optimizations on Scala performance , 2013, SCALA@ECOOP.
[44] Christian Wimmer,et al. Practical partial evaluation for high-performance dynamic language runtimes , 2017, PLDI.
[45] Scott A. Mahlke,et al. The superblock: An effective technique for VLIW and superscalar compilation , 1993, The Journal of Supercomputing.