Modeling application performance by convolving machine signatures with application profiles

This paper presents a performance modeling methodology that is faster than traditional cycle-accurate simulation and more sophisticated than performance estimation based on system peak-performance metrics, and that is shown to be effective on a class of High Performance Computing benchmarks. The method yields insight into the factors that affect performance on single-processor and parallel computers.
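
As a rough illustration of what combining a machine signature with an application profile could look like, the sketch below predicts a runtime by dividing each operation count in a profile by the rate the machine sustains for that kind of operation, then summing. This is a simplified assumption for illustration only, not the paper's actual model: the operation categories, the rates, the counts, and the function name `predict_runtime` are all hypothetical, and the sketch ignores any overlap between computation and memory access.

```python
# Hypothetical machine signature: sustained rate (operations per second)
# for each kind of operation. All values are invented for illustration.
machine_signature = {
    "flop": 1.0e9,         # floating-point operations
    "mem_stride1": 5.0e8,  # unit-stride memory references
    "mem_random": 5.0e7,   # random-access memory references
}

# Hypothetical application profile: how many operations of each kind
# the application performs. All values are invented for illustration.
application_profile = {
    "flop": 2.0e9,
    "mem_stride1": 1.0e9,
    "mem_random": 1.0e8,
}


def predict_runtime(profile, signature):
    """Return a predicted runtime in seconds.

    Each operation count is divided by the machine's sustained rate for
    that operation kind, and the per-kind times are summed. This assumes
    no overlap between operation kinds, which is a simplification.
    """
    total_seconds = 0.0
    for kind, count in profile.items():
        if kind not in signature:
            raise KeyError(f"machine signature has no rate for {kind!r}")
        total_seconds += count / signature[kind]
    return total_seconds


if __name__ == "__main__":
    seconds = predict_runtime(application_profile, machine_signature)
    print(f"predicted runtime: {seconds:.2f} s")
```

Keeping the machine description and the application description in separate tables means that, in a scheme like this, either one can be swapped out independently: the same profile can be evaluated against several machines, or several profiles against one machine, without re-measuring the other side.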
