MPI Trace Compression Using Event Flow Graphs

Understanding how parallel applications behave is crucial for using high-performance computing (HPC) resources efficiently. However, performance analysis is becoming increasingly difficult as scientific codes grow more complex and machines grow larger. Although many tools have been developed in recent years to help with this task, current approaches either offer only an overview of the application, discarding temporal information, or generate huge trace files that are often difficult to handle.
