Approaches to parallel performance prediction

Designing parallel programs is both interesting and difficult. The reason for using a parallel machine is to obtain better performance, yet at design time the programmer has little idea how a program will perform and finds out only by actually running it. Design decisions therefore have to be made by guesswork alone. This thesis explores an alternative: it provides data sheets describing the performance of parallel building blocks and then examines how they may be used in practice. The simplest way of using the data sheets is with a graphing and equation-plotting tool. More detailed design information is available from a "reverse" profiling technique, which adapts standard profiling to generate predictions rather than measurements. The most general method of prediction is based on discrete event simulation, which allows all programs to be modelled but is the most complex to use. The methods are compared, and their suitability for different design problems is discussed.
