Detecting Application Load Imbalance on High End Massively Parallel Systems

Scientific applications should be well balanced in order to achieve high scalability on current and future high-end massively parallel systems. However, identifying the sources of load imbalance in such applications is not a trivial exercise, and current state-of-the-art performance analysis tools do not provide an efficient mechanism to help users identify the main areas of load imbalance in an application. In this paper we discuss a new set of metrics that we defined to identify and measure application load imbalance. We then describe the extensions that were made to the Cray performance measurement and analysis infrastructure to detect application load imbalance and present it to the user in an insightful way.
