Atune-IL: An Instrumentation Language for Auto-tuning Parallel Applications

Auto-tuners automate the performance tuning of parallel applications. Current approaches have three major drawbacks: 1) they mainly focus on numerical software; 2) they typically do not attempt to reduce the large search space before search algorithms are applied; 3) they offer only limited means of providing an auto-tuner with additional information to improve tuning. Our paper tackles these problems in a novel way by focusing on the interaction between an auto-tuner and a parallel application. In particular, we introduce Atune-IL, an instrumentation language that uses new types of code annotations to mark tuning parameters, blocks, permutation regions, and measuring points. Atune-IL allows a more accurate extraction of meta-information, which helps an auto-tuner prune the search space before employing search algorithms. In addition, Atune-IL's concepts target parallel applications in general, not just numerical programs. We evaluated Atune-IL in several case studies with parallel applications that differ in size, programming language, and application domain; one case study used a large commercial application with nested parallelism. On average, Atune-IL reduced search spaces by 78%. In two corner cases, 99% of the search space could be pruned.
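To illustrate why block annotations can shrink a search space before any search algorithm runs, the following minimal sketch compares two counts. It assumes that parameters in separately marked blocks do not influence each other, so each block can be tuned on its own. This is not Atune-IL syntax or the paper's algorithm: the block names, parameter names, and value ranges are all hypothetical.

```python
from math import prod

# Hypothetical tuning parameters, grouped by the (assumed independent)
# block that contains them. Each parameter maps to its candidate values.
blocks = {
    "compress": {"num_threads": [1, 2, 4, 8], "chunk_kb": [64, 128, 256]},
    "io": {"buffers": [1, 2, 4], "prefetch": [False, True]},
}

def block_size(params):
    """Number of configurations inside one block (product of domain sizes)."""
    return prod(len(values) for values in params.values())

def joint_space(blocks):
    """Configurations to try when all parameters are tuned together."""
    return prod(block_size(params) for params in blocks.values())

def independent_space(blocks):
    """Configurations to try when each block is tuned separately."""
    return sum(block_size(params) for params in blocks.values())

def reduction(blocks):
    """Fraction of the joint search space that no longer has to be visited."""
    return 1 - independent_space(blocks) / joint_space(blocks)

if __name__ == "__main__":
    print(joint_space(blocks), independent_space(blocks), reduction(blocks))
```

With these made-up domains, the joint space is the product of the block sizes and the separately tuned space is their sum, so the gap grows quickly as blocks or parameter values are added.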
