MiCOMP: Mitigating the Compiler Phase-Ordering Problem Using Optimization Sub-Sequences and Machine Learning

Recent compilers offer a vast number of multilayered optimizations targeting different code segments of an application. Choosing among these optimizations can significantly impact the performance of the code being optimized. Selecting the right set of compiler optimizations for a particular code segment is already a hard problem, and finding the best ordering of these optimizations adds further complexity. Finding the best ordering is a long-standing problem in compilation research, known as the phase-ordering problem. The traditional approach of constructing compiler heuristics to solve this problem simply cannot cope with the enormous complexity of choosing the right ordering of optimizations for every code segment in an application. This article proposes MiCOMP, an automatic optimization framework that Mitigates the COMpiler Phase-ordering problem. We perform phase ordering of the optimizations in LLVM's highest optimization level using optimization sub-sequences and machine learning. The idea is to cluster the optimization passes of LLVM's -O3 setting into different clusters and to predict the speedup of a complete sequence of optimization clusters, instead of having to deal with the ordering of more than 60 individual optimizations. The predictive model uses (1) dynamic features, (2) an encoded version of the compiler sequence, and (3) an exploration heuristic to tackle the problem. Experimental results using the LLVM compiler framework and the Cbench suite show the effectiveness of the proposed clustering and encoding techniques for application-based reordering of passes across a number of predictive models. We perform statistical analysis on the results and compare against (1) random iterative compilation, (2) the standard optimization levels, and (3) two recent prediction approaches. We show that MiCOMP's iterative compilation using its sub-sequences can reach an average performance speedup of 1.31 (up to 1.51). Additionally, we demonstrate that MiCOMP's prediction model outperforms the -O1, -O2, and -O3 optimization levels using just a few predictions and reduces the prediction error rate to only 5%. Overall, it achieves 90% of the available speedup while exploring less than 0.001% of the optimization space.
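To make the sequence-prediction idea concrete, the following is a minimal sketch of how cluster orderings and dynamic program features could feed a speedup predictor. The cluster names, the eight-dimensional feature vector, the one-hot positional encoding, and the random-forest regressor are illustrative assumptions for this sketch, not MiCOMP's actual clusters, features, encoding, or model.

```python
# Hypothetical sketch: encode a cluster ordering together with dynamic
# program features and predict its speedup over -O3. All names and data
# below are stand-ins, not MiCOMP's real configuration.
from itertools import permutations

import numpy as np
from sklearn.ensemble import RandomForestRegressor

CLUSTERS = ["A", "B", "C", "D", "E"]  # assumed sub-sequences of -O3 passes

def encode(ordering, program_features):
    """Concatenate dynamic features with a positional one-hot encoding
    of the cluster ordering."""
    one_hot = np.zeros((len(CLUSTERS), len(CLUSTERS)))
    for position, cluster in enumerate(ordering):
        one_hot[position, CLUSTERS.index(cluster)] = 1.0
    return np.concatenate([program_features, one_hot.ravel()])

# Training data: (program features, cluster ordering) -> measured speedup.
# In practice these come from instrumented runs; here they are random stand-ins.
rng = np.random.default_rng(0)
orderings = list(permutations(CLUSTERS))
X = np.stack([encode(o, rng.random(8)) for o in orderings])
y = rng.uniform(0.8, 1.5, size=len(orderings))  # speedups vs. -O3

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# For a new program, rank all cluster orderings by predicted speedup and
# compile only the top few candidates (the exploration step).
new_features = rng.random(8)
predicted = [(o, model.predict(encode(o, new_features)[None, :])[0])
             for o in orderings]
best_ordering, best_speedup = max(predicted, key=lambda t: t[1])
print(best_ordering, round(float(best_speedup), 3))
```

With five clusters the ordering space is only 5! = 120 sequences, small enough to rank exhaustively; this is the point of working at the cluster level rather than over the more than 60 individual -O3 passes, whose ordering space is intractable.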
