Meta optimization: improving compiler heuristics with machine learning

Compiler writers have crafted many heuristics over the years to approximately solve NP-hard problems efficiently. Finding a heuristic that performs well on a broad range of applications is a tedious and difficult process. This paper introduces Meta Optimization, a methodology for automatically fine-tuning compiler heuristics. Meta Optimization uses machine-learning techniques, specifically an evolutionary algorithm, to search the space of compiler heuristics automatically. This reduces compiler design complexity by relieving compiler writers of the tedium of heuristic tuning. We present promising experimental results. In one mode of operation, Meta Optimization creates application-specific heuristics, which often result in impressive speedups. For hyperblock formation, one of the optimizations we present in this paper, we obtain an average speedup of 23% (up to 73%) for the applications in our suite. Furthermore, by evolving a compiler's heuristic over several benchmarks, we can create effective, general-purpose heuristics. The best general-purpose heuristic our system found for hyperblock formation improved performance by an average of 25% on our training set and 9% on a completely unrelated test set. We demonstrate the efficacy of our techniques on three optimizations: hyperblock formation, register allocation, and data prefetching.
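To make the idea of evolving a heuristic concrete, the following is a minimal sketch, not the system described in the paper. It assumes the heuristic is a priority function represented as an expression tree and evolved in a genetic-programming style. The feature names (`exec_ratio`, `num_ops`, `dep_height`), the operator set, and every parameter value are hypothetical. The fitness function is a synthetic stand-in: a real system would compile benchmarks with each candidate heuristic and measure their running time.

```python
import operator
import random

# Hypothetical features; a real compiler would expose its own measurements.
FEATURES = ["exec_ratio", "num_ops", "dep_height"]
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
MAX_DEPTH = 8  # assumed cap that keeps evolved trees small

def random_expr(rng, depth=3):
    """Return a feature name, a constant, or an (op, left, right) tuple."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return rng.choice(FEATURES)
        return round(rng.uniform(-1.0, 1.0), 2)
    op = rng.choice(list(OPS))
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def depth(expr):
    """Number of operator levels in the tree (0 for a leaf)."""
    if isinstance(expr, tuple):
        return 1 + max(depth(expr[1]), depth(expr[2]))
    return 0

def evaluate(expr, env):
    """Compute the priority that expr assigns to one feature vector."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(expr, str):
        return env[expr]
    return expr

def mutate(expr, rng):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if not isinstance(expr, tuple) or rng.random() < 0.2:
        return random_expr(rng, 2)
    op, left, right = expr
    if rng.random() < 0.5:
        return (op, mutate(left, rng), right)
    return (op, left, mutate(right, rng))

def random_subtree(expr, rng):
    """Walk down from the root and return a randomly chosen subtree."""
    while isinstance(expr, tuple) and rng.random() < 0.6:
        expr = rng.choice(expr[1:])
    return expr

def crossover(a, b, rng):
    """Graft a random subtree of b into a random position of a."""
    if not isinstance(a, tuple) or rng.random() < 0.3:
        return random_subtree(b, rng)
    op, left, right = a
    if rng.random() < 0.5:
        return (op, crossover(left, b, rng), right)
    return (op, left, crossover(right, b, rng))

def fitness(expr, samples):
    """Stand-in fitness: negative mean squared error against a made-up target.

    In a real setting this would be the measured speedup of compiled benchmarks.
    """
    total = 0.0
    for env in samples:
        target = env["exec_ratio"] * env["num_ops"] - env["dep_height"]
        total += (evaluate(expr, env) - target) ** 2
    return -total / len(samples)

def evolve(seed=0, pop_size=60, generations=30):
    """Run the evolutionary loop; return the best tree and per-generation best fitness."""
    rng = random.Random(seed)
    samples = [{f: rng.random() for f in FEATURES} for _ in range(40)]
    population = [random_expr(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=lambda e: fitness(e, samples), reverse=True)
        history.append(fitness(ranked[0], samples))
        elite = ranked[: pop_size // 4]  # the fittest quarter survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            parent = rng.choice(elite)
            child = crossover(parent, rng.choice(elite), rng)
            if rng.random() < 0.2:
                child = mutate(child, rng)
            # Fall back to the parent when the child exceeds the depth cap.
            children.append(child if depth(child) <= MAX_DEPTH else parent)
        population = elite + children
    best = max(population, key=lambda e: fitness(e, samples))
    return best, history

if __name__ == "__main__":
    best_expr, best_per_generation = evolve()
    print("best expression:", best_expr)
    print("fitness, first vs. last generation:",
          best_per_generation[0], best_per_generation[-1])
```

Because the fittest quarter of each generation is carried over unchanged and the stand-in fitness is deterministic, the best fitness never decreases from one generation to the next. With measured running times, which are noisy, that guarantee would not hold without repeated measurements.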
