A Survey on Compiler Autotuning using Machine Learning

Since the mid-1990s, researchers have applied machine-learning-based approaches to a range of compiler optimization problems. These techniques improve the quality of the obtained results and, more importantly, make it feasible to tackle two main compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase-ordering (choosing the order in which to apply them). The compiler optimization space continues to grow with the advancement of applications, the increasing number of compiler optimizations, and new target architectures. Generic optimization passes in compilers cannot fully leverage newly introduced optimizations and therefore cannot keep pace with the growing number of options. This survey summarizes and classifies recent advances in using machine learning for compiler optimization, focusing on the two major problems of (1) selecting the best optimizations and (2) the phase-ordering of optimizations. It highlights the approaches taken so far, the results obtained, the fine-grained classification of the different approaches, and the influential papers of the field.


[256]  Paolo Faraboschi,et al.  Embedded Computing: A VLIW Approach to Architecture, Compilers and Tools , 2004 .

[257]  Michael F. P. O'Boyle,et al.  Combined Selection of Tile Sizes and Unroll Factors Using Iterative Compilation , 2004, The Journal of Supercomputing.

[258]  Stefan M. Freudenberger,et al.  Phase Ordering of Register Allocation and Instruction Scheduling , 1991, Code Generation.

[259]  Vasilios I. Kelefouras,et al.  A methodology pruning the search space of six compiler transformations by addressing them together as one problem and by exploiting the hardware architecture details , 2017, Computing.

[260]  Luca Benini,et al.  The ANTAREX approach to autotuning and adaptivity for energy efficient HPC systems , 2016, Conf. Computing Frontiers.

[261]  Albert Cohen,et al.  Building a Practical Iterative Interactive Compiler , 2007 .

[262]  Luca Benini,et al.  The ANTAREX tool flow for monitoring and autotuning energy efficient HPC systems , 2017, 2017 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS).

[263]  João M. P. Cardoso,et al.  Impact of Compiler Phase Ordering When Targeting GPUs , 2017, Euro-Par Workshops.

[264]  Lieven Eeckhout,et al.  Deconstructing iterative optimization , 2012, TACO.

[265]  Gianluca Palermo,et al.  COBAYN: Compiler Autotuning Framework Using Bayesian Networks , 2016, ACM Trans. Archit. Code Optim..

[266]  João M. P. Cardoso,et al.  Use of Previously Acquired Positioning of Optimizations for Phase Ordering Exploration , 2015, SCOPES.

[267]  Mark Stephenson,et al.  Automating the construction of compiler heuristics using machine learning , 2006 .

[268]  José Nelson Amaral,et al.  Using machines to learn method-specific compilation strategies , 2011, International Symposium on Code Generation and Optimization (CGO 2011).

[269]  Michael F. P. O'Boyle,et al.  Fast compiler optimisation evaluation using code-feature based performance prediction , 2007, CF '07.

[270]  Guy L. Steele,et al.  Java(TM) Language Specification, The (3rd Edition) (Java (Addison-Wesley)) , 2005 .

[271]  James Demmel,et al.  Statistical Models for Empirical Search-Based Performance Tuning , 2004, Int. J. High Perform. Comput. Appl..

[272]  Michael F. P. O'Boyle,et al.  MILEPOST GCC: machine learning based research compiler , 2008 .

[273]  Grigori Fursin,et al.  Iterative compilation and performance prediction for numerical applications , 2004 .

[274]  Vittorio Zaccaria,et al.  A framework for Compiler Level statistical analysis over customized VLIW architecture , 2013, 2013 IFIP/IEEE 21st International Conference on Very Large Scale Integration (VLSI-SoC).

[275]  Agnieszka Kaminska,et al.  Statistical models to accelerate software development by means of iterative compilation , 2016, Comput. Sci..

[276]  Kyoung-jae Kim,et al.  Financial time series forecasting using support vector machines , 2003, Neurocomputing.

[277]  Simon J. Hollis,et al.  Identifying Compiler Options to Minimize Energy Consumption for Embedded Platforms , 2013, Comput. J..

[278]  Una-May O'Reilly,et al.  Genetic Programming Applied to Compiler Heuristic Optimization , 2003, EuroGP.

[279]  D. K. Arvind,et al.  Languages and Compilers for Parallel Computing , 2014, Lecture Notes in Computer Science.

[280]  Gustavo Camps-Valls,et al.  Semi-Supervised Graph-Based Hyperspectral Image Classification , 2007, IEEE Transactions on Geoscience and Remote Sensing.

[281]  Steven W. K. Tjiang,et al.  SUIF: an infrastructure for research on parallelizing and optimizing compilers , 1994, SIGP.

[282]  Christopher C. Cummins,et al.  Synthesizing benchmarks for predictive modeling , 2017, 2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).

[283]  John Cavazos,et al.  Energy Auto-Tuning using the Polyhedral Approach , 2014 .

[284]  Jürgen Schmidhuber,et al.  Deep learning in neural networks: An overview , 2014, Neural Networks.

[285]  Lieven Eeckhout,et al.  Practical Iterative Optimization for the Data Center , 2015, ACM Trans. Archit. Code Optim..

[286]  Oliver Ray,et al.  Automatically Tuning the GCC Compiler to Optimize the Performance of Applications Running on the ARM Cortex-M3 , 2017, ArXiv.

[287]  Alan Edelman,et al.  PetaBricks: a language and compiler for algorithmic choice , 2009, PLDI '09.