Rapidly Selecting Good Compiler Optimizations using Performance Counters

Applying the right compiler optimizations to a particular program can have a significant impact on program performance. Due to the non-linear interaction of compiler optimizations, however, determining the best setting is non-trivial. There have been several proposed techniques that search the space of compiler options to find good solutions; however such approaches can be expensive. This paper proposes a different approach using performance counters as a means of determining good compiler optimization settings. This is achieved by learning a model off-line which can then be used to determine good settings for any new program. We show that such an approach outperforms the state-of-the-art and is two orders of magnitude faster on average. Furthermore, we show that our performance counter-based approach outperforms techniques based on static code features. Using our technique we achieve a 17% improvement over the highest optimization setting of the commercial PathScale EKOPath 2.3.1 optimizing compiler on the SPEC benchmark suite on a AMD Athlon 64 3700+ platform

[1]  Robert J. McEliece,et al.  The Theory of Information and Coding , 1979 .

[2]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput..

[3]  Jeffrey Dean,et al.  ProfileMe: hardware support for instruction-level profiling on out-of-order processors , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[4]  Jack J. Dongarra,et al.  Automatically Tuned Linear Algebra Software , 1998, Proceedings of the IEEE/ACM SC98 Conference.

[5]  Keith D. Cooper,et al.  Optimizing for reduced code space using genetic algorithms , 1999, LCTES '99.

[6]  Trevor Mudge,et al.  MiBench: A free, commercially representative embedded benchmark suite , 2001 .

[7]  François Bodin,et al.  A Machine Learning Approach to Automatic Production of Compiler Heuristics , 2002, AIMSA.

[8]  David I. August,et al.  Compiler optimization-space exploration , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003..

[9]  Kerstin Eder,et al.  International Symposium on Code Generation and Optimization. CGO 2003 , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003..

[10]  Gang Ren,et al.  A comparison of empirical and model-driven optimization , 2003, PLDI '03.

[11]  Saman P. Amarasinghe,et al.  Meta optimization: improving compiler heuristics with machine learning , 2003, PLDI '03.

[12]  M. Naderi Think globally... , 2004, HIV prevention plus!.

[13]  Michael F. P. O'Boyle,et al.  Combined Selection of Tile Sizes and Unroll Factors Using Iterative Compilation , 2004, The Journal of Supercomputing.

[14]  Douglas L. Jones,et al.  Fast searches for effective optimization phase sequences , 2004, PLDI '04.

[15]  John Cavazos,et al.  Inducing heuristics to decide whether to schedule , 2004, PLDI '04.

[16]  Keith D. Cooper,et al.  Adaptive Optimizing Compilers for the 21st Century , 2002, The Journal of Supercomputing.

[17]  James Demmel,et al.  Statistical Models for Empirical Search-Based Performance Tuning , 2004, Int. J. High Perform. Comput. Appl..

[18]  David Parello,et al.  Towards a Systematic, Pragmatic and Architecture-Aware Program Optimization Process for Complex Processors , 2004, Proceedings of the ACM/IEEE SC2004 Conference.

[19]  Franz Franchetti,et al.  SPIRAL: Code Generation for DSP Transforms , 2005, Proceedings of the IEEE.

[20]  Michael Stumm,et al.  Online performance analysis by statistical sampling of microprocessor performance counters , 2005, ICS '05.

[21]  Grigori Fursin,et al.  Probabilistic source-level optimisation of embedded programs , 2005, LCTES '05.

[22]  Michael F. P. O'Boyle,et al.  Probabilistic source-level optimisation of embedded programs , 2005, LCTES.

[23]  Mark Stephenson,et al.  Predicting unroll factors using supervised classification , 2005, International Symposium on Code Generation and Optimization.

[24]  Peter M. W. Knijnenburg,et al.  Automatic selection of compiler options using non-parametric inferential statistics , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).

[25]  Andrei V. Kelarev,et al.  The Theory of Information and Coding , 2005 .

[26]  Keshav Pingali,et al.  Think globally, search locally , 2005, ICS '05.

[27]  Keith D. Cooper,et al.  ACME: adaptive compilation made efficient , 2005, LCTES '05.

[28]  Steven G. Johnson,et al.  The Design and Implementation of FFTW3 , 2005, Proceedings of the IEEE.

[29]  Mary Lou Soffa,et al.  A model-based framework: an approach for profit-driven optimization , 2005, International Symposium on Code Generation and Optimization.

[30]  Rudolf Eigenmann,et al.  Fast, automatic, procedure-level performance tuning , 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[31]  Michael F. P. O'Boyle,et al.  Method-specific dynamic compilation using logistic regression , 2006, OOPSLA '06.

[32]  Michael F. P. O'Boyle,et al.  Using machine learning to focus iterative optimization , 2006, International Symposium on Code Generation and Optimization (CGO'06).

[33]  Rudolf Eigenmann,et al.  Fast and effective orchestration of compiler optimizations for automatic performance tuning , 2006, International Symposium on Code Generation and Optimization (CGO'06).

[34]  Gary S. Tyson,et al.  Exhaustive optimization phase order space exploration , 2006, International Symposium on Code Generation and Optimization (CGO'06).