Spatial Based Feature Generation for Machine Learning Based Optimization Compilation

Modern compilers provide optimization options to obtain better performance for a given program. Effective selection of optimization options is a challenging task. Recent work has shown that machine learning can be used to select the best compiler optimization options for a given program. Machine learning techniques rely upon selecting features which represent a program in the best way. The quality of these features is critical to the performance of machine learning techniques. Previous work on feature selection for program representation is based on code size, mostly executed parts, parallelism and memory access patterns with-in a program. Spatial based information–how instructions are distributed with-in a program–has never been studied to generate features for the best compiler options selection using machine learning techniques. In this paper, we present a framework that address how to capture the spatial information with-in a program and transform it to features for machine learning techniques. An extensive experimentation is done using the SPEC2006 and MiBench benchmark applications. We compare our work with the IBM Milepost-gcc framework. The Milepost work gives a comprehensive set of features for using machine learning techniques for the best compiler options selection problem. Results show that the performance of machine learning techniques using spatial based features is better than the performance using the Milepost framework. With 66 available compiler options, we are also able to achieve 70% of the potential speed up obtained through an iterative compilation.

[1]  Michael F. P. O'Boyle,et al.  Portable compiler optimisation across embedded programs and microarchitectures using machine learning , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[2]  Sally A. McKee,et al.  Efficiently exploring architectural design spaces via predictive modeling , 2006, ASPLOS XII.

[3]  Yannis Manolopoulos,et al.  Structure-based similarity search with graph histograms , 1999, Proceedings. Tenth International Workshop on Database and Expert Systems Applications. DEXA 99.

[4]  Michael F. P. O'Boyle,et al.  Automatic Feature Generation for Machine Learning Based Optimizing Compilation , 2009, 2009 International Symposium on Code Generation and Optimization.

[5]  Michael F. P. O'Boyle,et al.  Method-specific dynamic compilation using logistic regression , 2006, OOPSLA '06.

[6]  Peter M. W. Knijnenburg,et al.  Iterative compilation in a non-linear optimisation space , 1998 .

[7]  Saman P. Amarasinghe,et al.  Meta optimization: improving compiler heuristics with machine learning , 2003, PLDI '03.

[8]  Rudolf Eigenmann,et al.  PEAK—a fast and effective performance tuning system via compiler optimization orchestration , 2008, TOPL.

[9]  J. Eliot B. Moss,et al.  Scheduling Straight-Line Code Using Reinforcement Learning and Rollouts , 1998, NIPS.

[10]  John Cavazos,et al.  Inducing heuristics to decide whether to schedule , 2004, PLDI '04.

[11]  Keith D. Cooper,et al.  Optimizing for reduced code space using genetic algorithms , 1999, LCTES '99.

[12]  Michael F. P. O'Boyle,et al.  MILEPOST GCC: machine learning based research compiler , 2008 .

[13]  Michael F. P. O'Boyle,et al.  Proceedings of the 1998 Workshop on Profile and Feedback Directed Compilation (PFDC'98) , 1998 .

[14]  Steven S. Muchnick,et al.  Advanced Compiler Design and Implementation , 1997 .

[15]  Michael F. P. O'Boyle,et al.  Using machine learning to focus iterative optimization , 2006, International Symposium on Code Generation and Optimization (CGO'06).