Statistical Performance Analysis for Scientific Applications

As high-performance computing (HPC) moves toward the exascale era, application performance analysis becomes more complex and less tractable. Using performance tools effectively usually requires considerable training, experience, and a good working knowledge of hardware/software interaction, which is a barrier for domain scientists. Moreover, instrumenting and profiling a large run can easily generate enormous volumes of data, making data management and characterization a further challenge. To address these problems, we develop a statistical method that extracts the principal performance features and produces easily interpretable results. This paper introduces a performance analysis methodology that combines Variable Clustering (VarCluster) with Principal Component Analysis (PCA), describes the analysis process, and presents experimental results for scientific applications on a Cray XT5 system. As a visualization aid, we use Voronoi tessellations to map the numerical results into graphical form and convey the performance information more clearly.
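The general idea of grouping correlated metrics and then summarizing each group with its first principal component can be illustrated with a minimal sketch. This is not the paper's VarCluster procedure: it assumes a simple greedy grouping by absolute Pearson correlation as a stand-in, and it finds the first principal component of each group by power iteration. The metric names, sample values, and the 0.8 threshold are hypothetical and chosen only for illustration.

```python
import math

def standardize(columns):
    """Center each metric to mean 0 and scale it to unit variance."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        out.append([(x - mean) / std if std > 0 else 0.0 for x in col])
    return out

def correlation(a, b):
    """Pearson correlation of two already standardized columns."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

def cluster_variables(columns, threshold=0.8):
    """Greedy stand-in for variable clustering (an assumption, not VarCluster):
    a metric joins the first cluster whose seed metric it correlates with."""
    clusters = []
    for j, col in enumerate(columns):
        for members in clusters:
            if abs(correlation(columns[members[0]], col)) >= threshold:
                members.append(j)
                break
        else:
            clusters.append([j])
    return clusters

def first_component(columns, iterations=200):
    """First principal component of a group of standardized metrics,
    found by power iteration on their correlation matrix."""
    k = len(columns)
    corr = [[correlation(columns[i], columns[j]) for j in range(k)]
            for i in range(k)]
    v = [1.0] + [0.0] * (k - 1)  # start from the cluster's seed metric
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:  # constant metric: nothing to extract
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(k))
                     for i in range(k))
    return v, eigenvalue / k  # loadings, share of group variance explained

# Hypothetical per-process samples of four counters (illustrative values only).
names = ["flops", "l2_misses", "mem_bandwidth", "mpi_wait"]
samples = [
    [9.0, 8.5, 7.0, 6.0, 4.0, 3.5],   # flops
    [1.0, 1.2, 2.1, 2.9, 4.2, 4.4],   # l2_misses
    [2.0, 2.3, 4.1, 6.0, 8.1, 8.9],   # mem_bandwidth
    [0.5, 0.4, 0.6, 0.5, 0.4, 0.6],   # mpi_wait
]

z = standardize(samples)
clusters = cluster_variables(z)
summaries = [first_component([z[j] for j in members]) for members in clusters]

for members, (loadings, explained) in zip(clusters, summaries):
    print([names[j] for j in members], loadings, explained)
```

In this toy data the three memory- and compute-related counters move together and collapse into one group, while the wait-time counter stays on its own. Each group is then represented by a single component, and the signs of the loadings show which counters rise or fall together.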
