Predicting Good Compiler Transformations Using Machine Learning

This dissertation presents a machine learning approach to the compiler optimisation problem, focused on one particular program transformation: loop unrolling. Loop unrolling is a simple but powerful code transformation, used mainly to improve instruction-level parallelism and to reduce loop-control overhead. It can also be detrimental, for example when the enlarged loop body degrades instruction-cache behaviour, and its interactions with other program transformations are not well understood. Consequently, deciding when and how to apply unrolling remains a challenge for compiler writers and researchers. This project works under the assumption that the effect of loop unrolling on program execution time can be learnt from past examples. A regression approach is therefore presented that learns the performance improvement of loops under unrolling. It differs from previous work ([Monsifrot et al., 2002] and [Stephenson and Amarasinghe, 2004]) in that it formulates the problem as regression rather than classification. Considerable effort has been invested in generating clean and reliable data suitable for learning. Two regression algorithms are used: Multiple Linear Regression and Classification and Regression Trees (CART). Although the accuracy of the models is limited, the final speed-ups obtained on seven out of twelve benchmarks indicate that the learning process does capture useful information: a maximum re-substitution improvement of 18% is achieved, with overall performance improvements of 2.5% for Linear Regression and 2.3% for CART. The present work is the first step of a larger project that aims to build a compiler that learns to optimise programs, and there is clear scope for improvement in the near future.
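To make the transformation under study concrete, the sketch below (not code from the dissertation) shows a simple loop and a version unrolled by a hypothetical factor of 4, with a remainder loop handling trip counts that are not a multiple of the unroll factor; the function names, the array operands, and the factor are assumptions chosen for illustration only.

```c
#include <stddef.h>

/* Original loop: one update and one loop-control test per element. */
void scale_rolled(double *a, const double *b, size_t n) {
    for (size_t i = 0; i < n; i++)
        a[i] = a[i] + 2.0 * b[i];
}

/* Unrolled by a factor of 4: four independent updates per iteration expose
 * instruction-level parallelism and amortise the loop-control overhead. */
void scale_unrolled4(double *a, const double *b, size_t n) {
    size_t limit = n - (n % 4);   /* largest multiple of 4 not exceeding n */
    size_t i;
    for (i = 0; i < limit; i += 4) {
        a[i]     = a[i]     + 2.0 * b[i];
        a[i + 1] = a[i + 1] + 2.0 * b[i + 1];
        a[i + 2] = a[i + 2] + 2.0 * b[i + 2];
        a[i + 3] = a[i + 3] + 2.0 * b[i + 3];
    }
    for (; i < n; i++)            /* remainder loop for the last n % 4 elements */
        a[i] = a[i] + 2.0 * b[i];
}
```

The trade-off summarised above is visible in this sketch: the unrolled body gives the scheduler more independent operations, but it is roughly four times larger, so an overly aggressive factor can overflow the instruction cache; the regression models in this dissertation aim to predict, from features of a loop, how much speed-up (if any) a given unrolling is likely to yield.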

[1] Saman P. Amarasinghe, et al. Meta optimization: improving compiler heuristics with machine learning, 2003, PLDI '03.

[2] François Bodin, et al. Computer aided hand tuning (CAHT): "applying case-based reasoning to performance tuning", 2001, ICS '01.

[3] Shun Long. Adaptive Java optimisation using machine learning techniques, 2004.

[4] Mark Stephenson, et al. Predicting Unroll Factors Using Nearest Neighbors, 2004.

[5] Dirk Grunwald, et al. Evidence-based static branch prediction using machine learning, 1997, TOPL.

[6] Jack J. Dongarra, et al. Unrolling loops in Fortran, 1979, Softw. Pract. Exp.

[7] R. L. Winkler, et al. Statistics: Probability, Inference and Decision, 1975.

[8] François Bodin, et al. A user level program transformation tool, 1998, ICS '98.

[9] W. D. Ray. Applied Linear Statistical Models (3rd Edition), 1991.

[10] François Bodin, et al. A Machine Learning Approach to Automatic Production of Compiler Heuristics, 2002, AIMSA.

[11] Leo Breiman, et al. Classification and Regression Trees, 1984.

[12] Jack W. Davidson, et al. An Aggressive Approach to Loop Unrolling, 2001.

[13] John R. Koza, et al. Genetic programming - on the programming of computers by means of natural selection, 1993, Complex adaptive systems.

[14] Petra Perner, et al. Data Mining - Concepts and Techniques, 2002, Künstliche Intell.

[15] Grigori Fursin, et al. Iterative compilation and performance prediction for numerical applications, 2004.

[16] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.

[17] Scott A. Mahlke, et al. Trimaran: An Infrastructure for Research in Instruction-Level Parallelism, 2004, LCPC.

[18] David F. Bacon, et al. Compiler transformations for high-performance computing, 1994, CSUR.

[19] Keith D. Cooper, et al. Engineering a Compiler, 2003.