Writing efficient parallel programs for a massively parallel system like the CRAY T3E is still a difficult task because such programs are typically very large and complex, are not trivially parallelizable, and exhibit dynamic behavior that is difficult to understand or predict. Runtime performance analysis tools are therefore needed on such systems in addition to the usual programming environment tools such as editors and debuggers. For the CRAY T3E, Silicon Graphics/Cray Research provides two performance analysis tools, Apprentice and PAT. Apprentice is a profiling tool that instruments source code through compiler switches and provides statistics at the level of functions and basic blocks. PAT, the Performance Analysis Tool, is actually several tools in one. It provides profiling through sampling and access to hardware performance information. It also includes an object code instrumenter that can be used for detailed call site profiling and for gathering function-level hardware performance statistics. In a collaboration between Silicon Graphics/Cray Research and Forschungszentrum Jülich, PAT was extended to also support event tracing. In this paper, we describe how the extended PAT and VAMPIR, an event trace browser developed by Forschungszentrum Jülich, can be used to analyze message passing programs on the CRAY T3E. The powerful trace browsing features of VAMPIR make it an ideal complement to PAT's object instrumentation and tracing functionality. First, the features of PAT are described in detail. To analyze message passing programs, the message passing libraries of the CRAY T3E (MPI, PVM, and SHMEM) needed to be instrumented; this instrumentation is described next. We then give an overview of VAMPIR and its functionality.
Using two small examples, we show how the combination of PAT's object instrumentation features, the new message passing function wrapper library, and VAMPIR's trace displays can be used to analyze message passing programs on the CRAY T3E at any desired level of detail.