Bootstrapping Parameter Space Exploration for Fast Tuning

The task of tuning parameters for optimizing performance or other metrics of interest such as energy, variability, etc. can be resource and time consuming. Presence of a large parameter space makes a comprehensive exploration infeasible. In this paper, we propose a novel bootstrap scheme, called GEIST, for parameter space exploration to find performance-optimizing configurations quickly. Our scheme represents the parameter space as a graph whose connectivity guides information propagation from known configurations. Guided by the predictions of a semi-supervised learning method over the parameter graph, GEIST is able to adaptively sample and find desirable configurations using limited results from experiments. We show the effectiveness of GEIST for selecting application input options, compiler flags, and runtime/system settings for several parallel codes including LULESH, Kripke, Hypre, and OpenAtom.

[1]  Henry Hoffmann,et al.  Maximizing Performance Under a Power Cap: A Comparison of Hardware, Software, and Hybrid Techniques , 2016, ASPLOS 2016.

[2]  Archana Ganapathi,et al.  A case for machine learning to optimize multicore performance , 2009 .

[3]  Jeffrey K. Hollingsworth,et al.  ANGEL: A Hierarchical Approach to Multi-Objective Online Auto-Tuning , 2015, ROSS@HPDC.

[4]  I-Hsin Chung,et al.  Active Harmony: Towards Automated Performance Tuning , 2002, ACM/IEEE SC 2002 Conference (SC'02).

[5]  Saurabh Bagchi,et al.  Rafiki: a middleware for parameter tuning of NoSQL datastores for dynamic metagenomics workloads , 2017, Middleware.

[6]  Lieven Eeckhout,et al.  RFHOC: A Random-Forest Approach to Auto-Tuning Hadoop's Configuration , 2016, IEEE Transactions on Parallel and Distributed Systems.

[7]  Peter N. Brown,et al.  KRIPKE - A MASSIVELY PARALLEL TRANSPORT MINI-APP , 2015 .

[8]  Pavlos Petoumenos,et al.  Minimizing the cost of iterative compilation with active learning , 2017, 2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).

[9]  Christos Faloutsos,et al.  CAMLP: Confidence-Aware Modulated Label Propagation , 2016, SDM.

[10]  Michael Gerndt,et al.  Automatic performance analysis with periscope , 2010 .

[11]  Prasanna Balaprakash,et al.  Exploiting Performance Portability in Search Algorithms for Autotuning , 2016, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).

[12]  Norbert Siegmund,et al.  Transfer learning for performance modeling of configurable systems: An exploratory analysis , 2017, 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE).

[13]  Lieven Eeckhout,et al.  Evaluating iterative optimization across 1000 datasets , 2010, PLDI '10.

[14]  Prasanna Balaprakash,et al.  Active-learning-based surrogate models for empirical performance tuning , 2013, 2013 IEEE International Conference on Cluster Computing (CLUSTER).

[15]  Michael Garland,et al.  Nitro: A Framework for Adaptive Code Variant Tuning , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.

[16]  Richard D. Hornung,et al.  The RAJA Portability Layer: Overview and Status , 2014 .

[17]  Abhishek Gupta,et al.  Parallel Programming with Migratable Objects: Charm++ in Practice , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.

[18]  Sven Apel,et al.  Performance Prediction of Multigrid-Solver Configurations , 2016, Software for Exascale Computing.

[19]  Robert D. Falgout,et al.  The Design and Implementation of hypre, a Library of Parallel High Performance Preconditioners , 2006 .

[20]  Thorsten Joachims,et al.  Transductive Learning via Spectral Graph Partitioning , 2003, ICML.

[21]  I-Hsin Chung,et al.  A Case Study Using Automatic Performance Tuning for Large-Scale Scientific Programs , 2006, 2006 15th IEEE International Conference on High Performance Distributed Computing.

[22]  Ignacio Laguna,et al.  Apollo: Reusable Models for Fast, Dynamic Tuning of Input-Dependent Code , 2017, 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS).

[23]  David Cohn,et al.  Active Learning , 2010, Encyclopedia of Machine Learning.

[24]  Rushil Anirudh,et al.  Performance Modeling under Resource Constraints Using Deep Transfer Learning , 2017, SC17: International Conference for High Performance Computing, Networking, Storage and Analysis.

[25]  Anne C. Elster,et al.  Machine learning‐based auto‐tuning for enhanced performance portability of OpenCL applications , 2017, Concurr. Comput. Pract. Exp..

[26]  Ananta Tiwari,et al.  Online Adaptive Code Generation and Tuning , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.

[27]  Thomas Fahringer,et al.  Multi-Objective Auto-Tuning with Insieme: Optimization and Trade-Off Analysis for Time, Energy and Resource Usage , 2014, Euro-Par.

[28]  Chun Chen,et al.  A scalable auto-tuning framework for compiler optimization , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.

[29]  O. Chapelle,et al.  Semi-Supervised Learning (Chapelle, O. et al., Eds.; 2006) [Book reviews] , 2009, IEEE Transactions on Neural Networks.

[30]  Laxmikant V. Kalé,et al.  OpenAtom: Scalable Ab-Initio Molecular Dynamics with Diverse Capabilities , 2016, ISC.

[31]  Robert Ricci,et al.  Active Learning in Performance Analysis , 2016, 2016 IEEE International Conference on Cluster Computing (CLUSTER).