Interpretive performance prediction for high performance parallel computing

The key factor contributing to the complexity of parallel application development and the poor utilization of current high performance computing (HPC) systems is the increased number of degrees of freedom that have to be resolved in such an environment. The primary objective of our research is to address this software development bottleneck. To that end, we develop the interpretive approach to performance prediction. The essence of this approach is the application of interpretation techniques to performance prediction through an appropriate characterization of the HPC system and the application. A comprehensive system characterization methodology is defined to hierarchically abstract the HPC system into a set of parameters that represent its performance. A corresponding application characterization methodology is defined to abstract a high-level application description into a set of parameters that represent its behavior. Performance prediction is then achieved by interpreting the execution costs of the abstracted application in terms of the parameters exported by the abstracted system. Models and heuristics are defined to handle accesses to the memory hierarchy, overlap between computation and communication, and user experimentation with system and run-time parameters. This thesis concentrates on distributed memory HPC systems and uses such a system to illustrate and validate the developed approach. An interpretive toolkit is designed and implemented to support HPF/Fortran 90D application development. It incorporates the following three systems: (1) ESP: An Interpretive Framework for HPF/Fortran 90D Performance Prediction; (2) ESP-i: An HPF/Fortran 90D Functional Interpreter; and (3) ESPial: An Integrated Environment for HPF/Fortran 90D Application Development & Execution.
The toolkit is supported by an interactive graphical user interface (ESPView) and provides the developer with the following functionality: design evaluation, functional verification, performance visualization, experimentation, compilation support, and execution support. A set of application codes and benchmarking kernels is used to validate the accuracy, utility, cost-effectiveness, and usability of the interpretive framework. The interpretive approach provides an accurate and cost-effective (in terms of time and resources required) evaluation methodology that can be used by any tool supporting HPC (e.g., intelligent compilers, mapping and load-balancing tools, and system design evaluation tools) that has to optimize over available design options.
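
The core idea, interpreting the costs of an abstracted application against parameters exported by an abstracted system, can be illustrated with a minimal Python sketch. This is not the ESP implementation: the names `SystemParams`, `Unit`, and `interpret`, the four system parameters, and the simple linear cost and overlap model are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class SystemParams:
    """Hypothetical parameters exported by an abstracted HPC system."""
    flop_time: float   # seconds per floating-point operation
    mem_time: float    # seconds per memory access
    latency: float     # startup cost per message, in seconds
    byte_time: float   # transfer cost per byte, in seconds


@dataclass
class Unit:
    """Hypothetical abstraction of one piece of the application."""
    flops: int = 0
    mem_accesses: int = 0
    messages: int = 0
    bytes_sent: int = 0
    overlap: float = 0.0  # assumed fraction of communication that can be hidden


def interpret(units, system):
    """Sum the predicted cost of each unit using the system parameters."""
    total = 0.0
    for u in units:
        comp = u.flops * system.flop_time + u.mem_accesses * system.mem_time
        comm = u.messages * system.latency + u.bytes_sent * system.byte_time
        # Assumed heuristic: overlapped communication hides behind computation,
        # but never more than the computation time available.
        hidden = min(comp, u.overlap * comm)
        total += comp + comm - hidden
    return total


if __name__ == "__main__":
    node = SystemParams(flop_time=1e-8, mem_time=2e-8,
                        latency=1e-4, byte_time=1e-8)
    app = [
        Unit(flops=1_000_000, mem_accesses=500_000),  # local computation
        Unit(messages=2, bytes_sent=8000),            # boundary exchange
    ]
    print(f"predicted time: {interpret(app, node):.6f} s")
```

Because the application and system abstractions are separate objects, the same `app` list can be re-interpreted against a different `SystemParams` instance, which is one way to picture the experimentation with system and run-time parameters that the abstract mentions.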
