A genetic algorithm based autotuning approach for performance and energy optimization

Autotuning is an empirical optimization approach in which the configuration space of an algorithmic code is explored systematically over a variety of software and hardware parameters. The objective of such autotuning is to reduce the computational time and/or energy requirements of the generated code. We develop a genetic algorithm based autotuning strategy that can be used to optimize performance, energy, or a combination of the two. The main advantage of our approach is that the number of compilations and executions explored in the configuration space is substantially smaller than in an exhaustive search. We demonstrate the usefulness of our approach by applying it to the small matrix multiplication routines that underlie spectral element solvers. These solvers are an important class of higher-order methods that are expected to be a computationally intensive portion of the next generation of large-scale CFD simulations. Our experiments were conducted on a variety of platforms. On AMD Fusion, for example, the genetic algorithm is able to obtain a 34% improvement in performance and a 37% reduction in energy consumption over existing versions of the code. Further, only a very small fraction of the entire configuration space needs to be explored.
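As a rough illustration of the idea, the following minimal sketch shows how a genetic algorithm can search a configuration space while measuring far fewer variants than an exhaustive search. It is not the implementation used in this work: the parameter names (`unroll_i`, `unroll_j`, `unroll_k`, `tile`), their ranges, the synthetic `measure` function that stands in for compiling and running a code variant, and the equal weighting of time and energy are all assumptions made for the example.

```python
import random

# Hypothetical tuning parameters; names and ranges are illustrative only.
SPACE = {
    "unroll_i": [1, 2, 4, 8],
    "unroll_j": [1, 2, 4, 8],
    "unroll_k": [1, 2, 4, 8],
    "tile": [4, 8, 16, 32],
}
KEYS = list(SPACE)


def measure(config):
    """Stand-in for compiling and running one variant.

    Returns synthetic (time, energy) values; a real tuner would measure them.
    """
    time = 1.0 + sum(abs(config[k] - 4) for k in KEYS[:3]) + abs(config["tile"] - 16) / 4
    energy = 2.0 * time + 0.1 * config["unroll_k"]
    return time, energy


def cost(config, cache, w_time=0.5, w_energy=0.5):
    """Weighted time/energy objective, cached so each variant is measured once."""
    key = tuple(config[k] for k in KEYS)
    if key not in cache:
        t, e = measure(config)
        cache[key] = w_time * t + w_energy * e
    return cache[key]


def random_config(rng):
    return {k: rng.choice(SPACE[k]) for k in KEYS}


def crossover(a, b, rng):
    # Uniform crossover: each parameter is taken from one of the two parents.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in KEYS}


def mutate(config, rng, rate=0.2):
    # With probability `rate`, resample a parameter from its allowed values.
    return {k: (rng.choice(SPACE[k]) if rng.random() < rate else v)
            for k, v in config.items()}


def tournament(pop, cache, rng, k=3):
    # Pick k random individuals and keep the cheapest one.
    return min(rng.sample(pop, k), key=lambda c: cost(c, cache))


def autotune(generations=10, pop_size=8, seed=0):
    rng = random.Random(seed)
    cache = {}
    pop = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        elite = min(pop, key=lambda c: cost(c, cache))
        children = [elite]  # elitism: the best variant always survives
        while len(children) < pop_size:
            child = crossover(tournament(pop, cache, rng),
                              tournament(pop, cache, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    best = min(pop, key=lambda c: cost(c, cache))
    # len(cache) is the number of distinct variants actually measured.
    return best, cost(best, cache), len(cache)


if __name__ == "__main__":
    best, best_cost, evaluated = autotune()
    print(best, best_cost, f"{evaluated} of 256 configurations measured")
```

With these settings the search measures at most 8 + 10 × 7 = 78 distinct variants out of the 256 in this toy space. Passing `w_time=1, w_energy=0` (or the reverse) to `cost` would tune for performance or energy alone.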
