Scalable Event Trace Visualization

Parallel event trace visualizations can aid in discovery of the root causes of certain performance problems on high-end systems. However, traditional trace visualizations are not inherently scalable and require considerable effort on the part of the user to identify similarities and differences in performance across parallel entities. In this work, we evaluate several methods for deciding when traces of different processes in a run are similar enough that only one of the traces needs to be retained and rendered in the visualization. We show visualizations of reduced traces and evaluate them for compression, error, and retention of correct diagnostic information.
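As an illustration of the general idea of similarity-based trace reduction, the following minimal sketch keeps one representative trace per group of similar processes. It is not one of the methods evaluated in this work. The per-process representation (a list of event durations), the relative Euclidean distance, the greedy grouping, and the names `distance`, `reduce_traces`, and `threshold` are all assumptions made for this example.

```python
from math import sqrt


def distance(a, b):
    """Relative Euclidean distance between two equal-length duration vectors.

    Assumed metric for illustration only.
    """
    num = sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    den = sqrt(sum(x * x for x in a)) or 1.0  # avoid division by zero
    return num / den


def reduce_traces(traces, threshold):
    """Greedily group process traces; retain one representative per group.

    traces: dict mapping process rank -> list of event durations (hypothetical format).
    Returns a dict mapping each retained rank -> list of ranks it stands for.
    """
    representatives = {}
    for rank, trace in traces.items():
        for rep in representatives:
            same_shape = len(traces[rep]) == len(trace)
            if same_shape and distance(traces[rep], trace) <= threshold:
                representatives[rep].append(rank)
                break
        else:
            # No retained trace is close enough, so this trace is kept.
            representatives[rank] = [rank]
    return representatives


# Made-up durations for four processes; rank 2 is the outlier.
traces = {
    0: [1.0, 2.0, 1.0],
    1: [1.0, 2.1, 1.0],
    2: [1.0, 5.0, 1.0],
    3: [1.1, 2.0, 1.0],
}
groups = reduce_traces(traces, threshold=0.1)
compression = len(traces) / len(groups)  # traces per retained representative
```

In this toy input, only the traces of ranks 0 and 2 would be rendered, with rank 0 standing in for ranks 1 and 3. The choice of metric and threshold governs the trade-off between compression and error that the evaluation measures.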
