Predicting Unroll Factors Using Nearest Neighbors

To deliver on the promise of Moore's Law, compilers must make decisions that are intimately tied to a specific target architecture. As engineers add architectural features to increase performance, systems become harder to model, and it therefore becomes harder for a compiler to make effective decisions. Machine-learning techniques may help compiler writers model modern architectures: because they can effectively make sense of high-dimensional spaces, they are a valuable tool for clarifying and discerning complex decision boundaries. In our work we focus on loop unrolling, a well-known optimization for exposing instruction-level parallelism. Using the Open Research Compiler (ORC) as a testbed, we demonstrate how one can use supervised learning techniques to model the appropriateness of loop unrolling. We use more than 1,100 loops, drawn from 46 benchmarks, to train a simple learning algorithm to recognize when loop unrolling is advantageous. The resulting classifier can predict with 88% accuracy whether a novel loop (i.e., one that was not in the training set) benefits from loop unrolling. Furthermore, we can predict the optimal or nearly optimal unroll factor 74% of the time. We evaluate the ramifications of these prediction accuracies using ORC on the Itanium® 2 architecture. The learned classifier yields a 6% speedup over ORC's unrolling heuristic on the SPEC benchmarks, and a 7% speedup on the remainder of our benchmarks. Because the learning techniques we employ run very quickly, we were able to exhaustively determine the four most salient loop characteristics for deciding when unrolling is beneficial.
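The title names the core technique: nearest-neighbor classification over loop features. As a minimal sketch of that idea (the feature names, the neighbor count k, and the toy data below are illustrative assumptions, not the paper's actual feature set or measurements), a k-NN predictor of unroll factors might look like:

```python
# Minimal k-nearest-neighbors sketch for predicting an unroll factor
# from loop features. Feature names, k, and the toy data are
# illustrative assumptions, not the paper's actual setup.
import numpy as np
from collections import Counter

# Hypothetical per-loop features: (trip count, body size in ops,
# number of memory operations, number of floating-point operations).
X_train = np.array([
    [100, 12, 4, 2],   # each row describes one training loop
    [  8, 40, 9, 0],
    [256,  6, 2, 4],
    [ 16, 25, 7, 1],
], dtype=float)
# Best unroll factor for each training loop, found by measuring
# every candidate factor offline (1 means "do not unroll").
y_train = np.array([8, 1, 8, 2])

def predict_unroll_factor(x, k=3):
    """Return the majority unroll factor among the k nearest loops."""
    # Normalize features so no single dimension dominates the distance.
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0) + 1e-9
    Xn, xn = (X_train - mu) / sigma, (np.asarray(x, float) - mu) / sigma
    # Euclidean distance from the query loop to every training loop.
    dists = np.linalg.norm(Xn - xn, axis=1)
    nearest = np.argsort(dists)[:k]
    # Vote: the most common label among the nearest neighbors wins.
    return Counter(y_train[nearest]).most_common(1)[0][0]

print(predict_unroll_factor([128, 10, 3, 3]))  # likely predicts 8
```

The same mechanism covers the abstract's binary question (should this loop be unrolled at all?) by collapsing the labels to {unroll, don't unroll}, and the multi-class question (which factor?) by keeping the per-loop best factors as labels.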