Predicting unroll factors using supervised classification

Compilers base many critical decisions on abstracted architectural models. While recent research has shown that modeling is effective for some compiler problems, building accurate models requires a great deal of human time and effort. This paper describes how machine learning techniques can be leveraged to help compiler writers model complex systems. Because learning techniques can effectively make sense of high-dimensional spaces, they can be a valuable tool for clarifying and discerning complex decision boundaries. In this work we focus on loop unrolling, a well-known optimization for exposing instruction-level parallelism. Using the Open Research Compiler as a testbed, we demonstrate how one can use supervised learning techniques to determine the appropriateness of loop unrolling. We use more than 2,500 loops, drawn from 72 benchmarks, to train two different learning algorithms to predict unroll factors (i.e., the amount by which to unroll a loop) for any novel loop. The technique correctly predicts the unroll factor for 65% of the loops in our dataset, which leads to a 5% overall improvement for the SPEC 2000 benchmark suite (9% for the SPEC 2000 floating-point benchmarks).
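
To make the classification framing concrete, the sketch below shows how unroll-factor selection can be posed as supervised multiclass classification. It is an illustration only, not the authors' implementation: the loop features, the label range (unroll factors 1 through 8), and the use of nearest-neighbor and SVM classifiers from scikit-learn are assumptions made for this example.

```python
# Minimal sketch (illustrative assumptions, not the paper's actual setup):
# frame "pick an unroll factor for a loop" as multiclass classification
# over per-loop feature vectors.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Hypothetical feature vectors: one row per loop, e.g. trip count,
# operand count, live ranges, critical path length (placeholders here).
rng = np.random.default_rng(0)
X = rng.random((2500, 4))                  # placeholder loop features
y = rng.integers(1, 9, size=2500)          # placeholder labels: best unroll factor 1..8

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Two classifiers, mirroring the paper's use of two learning algorithms
# (the specific algorithms chosen here are an assumption).
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
svm = SVC(kernel="rbf").fit(X_train, y_train)

print("NN accuracy: ", knn.score(X_test, y_test))
print("SVM accuracy:", svm.score(X_test, y_test))

# At compile time, the trained model maps a novel loop's feature vector
# to a predicted unroll factor, replacing a hand-tuned heuristic.
novel_loop_features = rng.random((1, 4))   # placeholder
print("Predicted unroll factor:", svm.predict(novel_loop_features)[0])
```

In a real compiler setting, the feature vectors would be produced by the compiler's own analysis passes at the point where the unrolling decision is made, and the trained classifier would stand in for the hand-written unrolling heuristic.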
