Parallel Parameter Tuning for Applications with Performance Variability

In this paper, we present parallel on-line optimization algorithms for parameter tuning of parallel programs. We employ direct search algorithms that update parameters based on real-time performance measurements. We discuss the impact of performance variability on the accuracy and efficiency of the optimization algorithms and propose modified versions of the direct search algorithms to cope with it. The modified versions use multiple samples instead of a single sample to estimate the performance more accurately. We present preliminary results showing that the performance variability of applications on clusters is heavy-tailed. Finally, we study and demonstrate the performance of the proposed algorithms on a real scientific application.
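To make the multiple-sample idea concrete, the following is a minimal sketch and not the algorithm evaluated in the paper. It assumes a hypothetical cost function (`true_cost`), additive non-negative Pareto noise as a stand-in for heavy-tailed variability, a simple compass-style direct search, and the minimum of the samples as the estimator. The abstract does not specify how samples are combined, so the choice of `min` is an assumption of this sketch, as are all names and constants.

```python
import random

def true_cost(params):
    # Hypothetical noise-free running time; minimum value 1.0 at (3, 3, ...).
    return sum((p - 3.0) ** 2 for p in params) + 1.0

def measure(params, rng):
    # One noisy measurement: true cost plus non-negative heavy-tailed noise.
    # paretovariate(alpha) returns values >= 1, so the noise term is >= 0.
    return true_cost(params) + (rng.paretovariate(1.5) - 1.0)

def estimate(params, rng, samples=1):
    # Assumption of this sketch: combine repeated measurements with min().
    return min(measure(params, rng) for _ in range(samples))

def compass_search(start, rng, samples=1, step=2.0, min_step=0.25, max_evals=500):
    # Compass-style direct search: probe +/- step along each coordinate,
    # accept any improving point, and halve the step when nothing improves.
    best = list(start)
    best_val = estimate(best, rng, samples)
    evals = 1
    while step >= min_step and evals < max_evals:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                cand = list(best)
                cand[i] += delta
                val = estimate(cand, rng, samples)
                evals += 1
                if val < best_val:
                    best, best_val, improved = cand, val, True
        if not improved:
            step /= 2.0
    return best, best_val

if __name__ == "__main__":
    for k in (1, 5):
        point, value = compass_search([0.0, 0.0], random.Random(0), samples=k)
        print(k, point, round(value, 3), round(true_cost(point), 3))
```

With a single sample, one unlucky measurement at a good point can cause the search to reject it. Taking several samples per point reduces that risk at the price of more measurements per step, which is the accuracy versus efficiency trade-off the abstract refers to.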
