Simulation-based code duplication for enhancing compiler optimizations

The scope of compiler optimizations is often limited by control flow, which prohibits optimizations across basic block boundaries. Code duplication can solve this problem by extending basic block sizes, thus enabling subsequent optimizations. However, duplicating code for every optimization opportunity may lead to excessive code growth. Therefore, a holistic approach is required that is capable of finding optimization opportunities and classifying their impact. This paper presents a novel approach to determine which code should be duplicated in order to improve peak performance. The approach analyzes duplication candidates for subsequent optimization opportunities. It does so by simulating a duplication and analyzing its impact on other optimizations. This allows a compiler to weigh up multiple success metrics in order to choose those duplications with the maximum optimization potential. We further show how to map code duplication opportunities to an optimization cost model that allows us to maximize performance while minimizing code size increase.
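The trade-off described above can be illustrated with a minimal sketch. It is not the paper's cost model: the `Candidate` fields, the benefit-per-size ranking, and the `size_budget` parameter are all assumptions chosen for illustration. The sketch assumes that a prior simulation step has already estimated, for each duplication candidate, the cycles saved by follow-up optimizations, the code size increase, and the relative execution probability of the affected path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical result of simulating one duplication (all fields assumed)."""
    name: str
    cycles_saved: float   # estimated benefit of optimizations enabled by duplicating
    size_increase: int    # estimated code size cost of performing the duplication
    probability: float    # relative execution frequency of the duplicated path


def select_duplications(candidates, size_budget):
    """Greedily pick candidates by expected benefit per unit of code size.

    This is an illustrative heuristic, not the selection strategy of the paper.
    """
    # Discard candidates whose simulation found no expected benefit.
    worthwhile = [c for c in candidates if c.cycles_saved * c.probability > 0]
    # Rank by expected cycles saved per unit of code growth, best first.
    ranked = sorted(
        worthwhile,
        key=lambda c: c.cycles_saved * c.probability / max(c.size_increase, 1),
        reverse=True,
    )
    chosen = []
    remaining = size_budget
    for c in ranked:
        # Perform a duplication only if it still fits in the size budget.
        if c.size_increase <= remaining:
            chosen.append(c)
            remaining -= c.size_increase
    return chosen


candidates = [
    Candidate("A", cycles_saved=10, size_increase=4, probability=0.9),
    Candidate("B", cycles_saved=2, size_increase=8, probability=0.5),
    Candidate("C", cycles_saved=0, size_increase=3, probability=1.0),
    Candidate("D", cycles_saved=6, size_increase=2, probability=0.5),
]
selected = [c.name for c in select_duplications(candidates, size_budget=7)]
print(selected)
```

In this sketch, candidate C is dropped because its simulation shows no benefit, and B is skipped because it no longer fits in the remaining budget. A real compiler would likely combine more metrics than the three assumed here.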
