Modeling and Predicting Disk I/O Time of HPC Applications

Understanding input/output (I/O) performance in high performance computing (HPC) is becoming increasingly important as the gap between computation and I/O performance widens. In this paper we propose a methodology to predict an application's disk I/O time while running on High Performance Computing Modernization Program (HPCMP) systems. Our methodology consists of the following steps: 1) Characterize the I/O operations of an application running on a reference system. 2) Using a configurable I/O benchmark, collect statistics on both the reference and target systems for the I/O operations that are relevant to the application. 3) Calculate the ratio of the measured I/O performance on the reference system to that on each target system, and use it to predict the application's I/O time on the target systems. Our results show that this methodology can accurately predict the I/O time of relevant HPC applications on HPCMP systems whose I/O performance is reasonably stable from run to run, while systems with wide variability in I/O performance are more difficult to predict accurately.
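The three steps above can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the per-operation-class scaling below is an assumption: each class of I/O operation observed on the reference system is scaled by the ratio of benchmark times measured on the target and reference systems. The names `OpClass` and `predict_io_time` and all field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class OpClass:
    """One class of I/O operation (hypothetical structure, not from the paper)."""
    name: str
    app_time_ref: float       # step 1: application time in this class on the reference system (s)
    bench_time_ref: float     # step 2: benchmark time for this class on the reference system (s)
    bench_time_target: float  # step 2: benchmark time for this class on the target system (s)


def predict_io_time(ops: Iterable[OpClass]) -> float:
    """Step 3 (assumed form): scale each class by target/reference benchmark ratio and sum."""
    total = 0.0
    for op in ops:
        if op.bench_time_ref <= 0:
            raise ValueError(f"non-positive reference benchmark time for {op.name}")
        ratio = op.bench_time_target / op.bench_time_ref
        total += op.app_time_ref * ratio
    return total


# Illustrative numbers only; they are not measurements from the paper.
example_ops = [
    OpClass("write", app_time_ref=10.0, bench_time_ref=2.0, bench_time_target=4.0),
    OpClass("read", app_time_ref=5.0, bench_time_ref=1.0, bench_time_target=0.5),
]
predicted = predict_io_time(example_ops)
```

Under this assumed form, a target system that runs the benchmark's write pattern twice as slowly as the reference doubles the predicted write time. Because the prediction depends on a single benchmark measurement per system, run-to-run variability in those measurements would carry directly into the ratio, which is consistent with the abstract's observation that unstable systems are harder to predict.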
