IOPin: Runtime Profiling of Parallel I/O in HPC Systems

Many I/O- and data-intensive scientific applications use parallel I/O software to access files with high performance. On modern parallel machines, the I/O software consists of several layers, including high-level libraries such as Parallel netCDF and HDF, middleware such as MPI-IO, and the low-level POSIX interface supported by the file systems. For I/O software developers, it is important to ensure that data flows among these software layers with performance close to the hardware limits. This task requires understanding the design of the individual libraries and the characteristics of the data flow among them. In this paper, we propose a dynamic instrumentation framework that can be used to understand the complex interactions across different I/O layers, from applications down to the underlying parallel file systems. Our preliminary experience indicates that the cost of using the proposed dynamic instrumentation is about 7% of the application execution time.
