Paper Information - Use of Predictive Performance Modeling during Large-scale System Installation

In this paper we describe an important use of predictive application performance modeling: the validation of measured performance during a new large-scale system installation. Using a previously developed and validated performance model for SAGE, a multidimensional, 3D, multi-material hydrodynamics code with adaptive mesh refinement, we were able to help guide the stabilization of the first phase of the Los Alamos ASCI Q supercomputer. We review the salient features of an analytical model for this code that has been applied to predict its performance on a large class of Tera-scale parallel systems. We describe the methodology applied during system installation and upgrades to establish a baseline for the achievable "real" performance of the system. We also show the effect of certain key subsystems, such as PCI bus speed and multi-rail networks, on overall application performance. We show that predictive performance models are also a powerful system debugging tool.
