Compiler optimization-space exploration

To meet the demands of modern architectures, optimizing compilers must incorporate an ever larger number of increasingly complex transformation algorithms. Since code transformations may often degrade performance or interfere with subsequent transformations, compilers employ predictive heuristics to guide optimizations by predicting their effects a priori. Unfortunately, the unpredictability of optimization interaction and the irregularity of today's wide-issue machines severely limit the accuracy of these heuristics. As a result, compiler writers may temper high-variance optimizations with overly conservative heuristics or may exclude these optimizations entirely. While this process results in a compiler capable of generating good average code quality across the target benchmark set, it does so at the cost of missed optimization opportunities in individual code segments. To replace predictive heuristics, researchers have proposed compilers that explore many optimization options, selecting the best one a posteriori. Unfortunately, these existing iterative compilation techniques are not practical for reasons of compile time and applicability. We present the Optimization-Space Exploration (OSE) compiler organization, the first practical iterative compilation strategy applicable to optimizations in general-purpose compilers. Instead of replacing predictive heuristics, OSE uses the compiler writer's knowledge encoded in the heuristics to select a small number of promising optimization alternatives for a given code segment. Compile time is limited by evaluating only these alternatives for hot code segments using a general compile-time performance estimator. An OSE-enhanced version of Intel's highly tuned, aggressively optimizing production compiler for IA-64 yields a significant performance improvement, more than 20% in some cases, on Itanium for SPEC codes.
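The selection step described above can be illustrated with a minimal sketch. It is not the OSE implementation: the function `ose_compile` and its parameters (`candidates`, `compile_with`, `estimate_cycles`, `is_hot`) are hypothetical names chosen for illustration, and the sketch assumes the estimator returns a single cost where lower is better.

```python
from typing import Callable, Sequence, Tuple, TypeVar

Segment = TypeVar("Segment")
Config = TypeVar("Config")
Code = TypeVar("Code")


def ose_compile(
    segment: Segment,
    default_config: Config,
    candidates: Sequence[Config],
    compile_with: Callable[[Segment, Config], Code],
    estimate_cycles: Callable[[Code], float],
    is_hot: Callable[[Segment], bool],
) -> Tuple[Config, Code]:
    """Pick a configuration for one code segment a posteriori.

    All callables are hypothetical stand-ins:
      compile_with    -- compiles the segment under one configuration
      estimate_cycles -- compile-time performance estimate (lower is better)
      is_hot          -- whether the segment is worth exploring at all
    """
    # Always produce the default compilation; cold segments stop here,
    # which is what keeps compile time bounded.
    best_config = default_config
    best_code = compile_with(segment, default_config)
    if not is_hot(segment):
        return best_config, best_code

    # Hot segment: try only the small, pre-selected set of alternatives
    # and keep whichever version the estimator rates best.
    best_cost = estimate_cycles(best_code)
    for config in candidates:
        code = compile_with(segment, config)
        cost = estimate_cycles(code)
        if cost < best_cost:
            best_config, best_code, best_cost = config, code, cost
    return best_config, best_code
```

The sketch keeps the default compilation as a fallback, so an alternative is adopted only when the estimator rates it strictly better; how the candidate set is derived from the existing heuristics is outside what this fragment shows.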
