Using machine learning to predict the code size impact of duplication heuristics in a dynamic compiler

Code duplication is a major enabler of optimizations in subsequent compiler phases. However, duplicating code prematurely or too liberally can lead to substantial code size increases. Modern compilers therefore trade off the estimated cost of a duplication, in terms of code size increase, against its expected benefit, in terms of performance improvement. In this ongoing research project, we propose using machine learning to supply the trade-off function with accurate predictions of code size impact. To evaluate our approach, we implemented a neural network predictor in the GraalVM compiler and compared it against a human-crafted, highly tuned heuristic. Initial results are promising, with code size reductions of more than 10% on several benchmarks. Additionally, we present an assistance mode for finding flaws in the human-crafted heuristic, leading to improvements to the duplication optimization itself.
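The abstract does not spell out the predictor's architecture, feature set, or how its output feeds the trade-off decision, so the following is only a minimal, hypothetical sketch of the general idea: a small feed-forward network maps features of a duplication candidate to a predicted code size increase, and the trade-off function duplicates only when the estimated benefit outweighs that predicted cost. The class name, the features (node count, loop depth, branch probability), the toy weights, and the budgetFactor parameter are all assumptions made for illustration and are not taken from the GraalVM implementation.

```java
// Hypothetical sketch: a learned size-cost model plugged into a duplication
// trade-off decision. Weights would be trained offline on measured code size
// deltas; here they are toy values so the example runs stand-alone.
public class DuplicationCostPredictor {
    // Tiny two-layer MLP: features -> hidden (ReLU) -> scalar size estimate.
    private final double[][] w1;  // [hidden][features]
    private final double[] b1;    // [hidden]
    private final double[] w2;    // [hidden]
    private final double b2;

    DuplicationCostPredictor(double[][] w1, double[] b1, double[] w2, double b2) {
        this.w1 = w1; this.b1 = b1; this.w2 = w2; this.b2 = b2;
    }

    /** Predicts the code size increase caused by one duplication candidate. */
    double predictSizeIncrease(double[] features) {
        double[] hidden = new double[w1.length];
        for (int i = 0; i < w1.length; i++) {
            double sum = b1[i];
            for (int j = 0; j < features.length; j++) {
                sum += w1[i][j] * features[j];
            }
            hidden[i] = Math.max(0.0, sum);  // ReLU activation
        }
        double out = b2;
        for (int i = 0; i < hidden.length; i++) {
            out += w2[i] * hidden[i];
        }
        return out;
    }

    /** Hypothetical trade-off: duplicate only if the estimated benefit
     *  exceeds the predicted size cost scaled by a tunable budget factor. */
    boolean shouldDuplicate(double[] features, double estimatedBenefit, double budgetFactor) {
        return estimatedBenefit > budgetFactor * predictSizeIncrease(features);
    }

    public static void main(String[] args) {
        // Toy weights for a 3-feature, 2-hidden-unit network (illustration only).
        DuplicationCostPredictor p = new DuplicationCostPredictor(
            new double[][] {{0.8, 0.1, 0.3}, {0.2, 0.5, 0.1}},
            new double[] {0.0, 0.1},
            new double[] {1.2, 0.7},
            0.05);
        // Assumed features: node count of the duplicated block, loop depth,
        // and branch probability at the merge point.
        double[] candidate = {42.0, 2.0, 0.35};
        System.out.println("predicted size increase: " + p.predictSizeIncrease(candidate));
        System.out.println("duplicate? " + p.shouldDuplicate(candidate, 60.0, 1.0));
    }
}
```

In such a setup the learned model would replace the hand-tuned cost constants of the existing heuristic, while the surrounding trade-off logic and benefit estimation could remain unchanged.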
