Mapping performance data for high-level and data views of parallel program performance

Programs written in high-level parallel languages need profiling tools that provide performance data in terms of the semantics of the high-level language. But high-level performance data can be incomplete when the cause of a performance problem cannot be explained in terms of the semantics of the language. We also need the ability to view the performance of the underlying mechanisms used by the language and to correlate that underlying activity with the language source code. The key technique for providing these performance views is the ability to map low-level performance data up to the language abstractions. We describe how we use mapping information to produce performance data at the higher levels, and how we present this data in terms of both the code and the parallel data structures. We have developed an implementation of these mapping techniques for the data parallel CM Fortran language running on the TMC CM-5. We have augmented the Paradyn Parallel Performance Tools with these mapping and high-level language facilities and used them to study several real data parallel Fortran (CM Fortran) applications.
