Detailed Performance Analysis Using Coarse Grain Sampling

Performance evaluation tools let analysts understand how applications behave, both overall and at specific points of the execution, but they cannot provide detailed information beyond the monitored regions of code. The ability to determine when to collect data, and which data to collect, is crucial for a successful analysis. This is particularly true for trace-based tools, which can easily produce either unmanageably large traces or too little information. To mitigate this well-known trade-off between resolution and usability, we present a procedure that obtains fine-grained performance information from coarse-grained sampling by projecting performance metrics scattered across the whole execution onto thoroughly detailed representative regions. This mechanism has been incorporated into the MPItrace tracing suite, greatly extending the performance information gathered at statically instrumented points with further periodic samples collected beyond them. We have applied this solution to the analysis of two applications to introduce a novel performance analysis methodology based on the combination of instrumentation and sampling techniques.
