Boosting Application-Specific Parallel I/O Optimization Using IOSIG

Many scientific applications spend a significant portion of their execution time accessing data in files. Various optimization techniques, such as data prefetching and data layout optimization, exist to improve data access performance. However, the optimization process is usually difficult because of the complexity involved in understanding I/O behavior, so tools that simplify it are of significant importance. In this paper, we introduce a tool, called IOSIG, that provides a better understanding of parallel I/O accesses and supplies information for use by optimization techniques. The tool traces the parallel I/O calls of an application and analyzes the collected information to give a clear picture of the application's I/O behavior. We show that the performance overheads of the tool in trace collection and analysis are negligible. The analysis step creates I/O signatures that various optimizations can use to improve I/O performance. I/O signatures are compact, easy-to-understand, parameterized representations of data access patterns, capturing information such as access size, strides between consecutive accesses, repetition, and timing. The signatures describe both the local I/O behavior of each process and the global behavior of the overall application. We illustrate the use of the IOSIG tool in data prefetching and data layout optimizations.
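
To make the idea of a parameterized signature concrete, the following is a minimal sketch of how a traced sequence of accesses might be compressed into (start, size, stride, repetition) records. It is an illustration only and does not reproduce IOSIG's actual signature format or analysis algorithm: the names `StridedRun` and `summarize`, the trace representation as `(offset, size)` pairs, and the greedy grouping rule are all assumptions made for this example, and timing information is omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StridedRun:
    """Hypothetical compact record for a run of equally sized, equally spaced accesses."""
    start: int   # file offset of the first access in the run
    size: int    # bytes requested by each access
    stride: int  # distance between starting offsets of consecutive accesses (0 for a single access)
    count: int   # number of accesses in the run (the repetition)


def summarize(accesses: List[Tuple[int, int]]) -> List[StridedRun]:
    """Greedily group a trace of (offset, size) pairs into strided runs."""
    runs: List[StridedRun] = []
    for offset, size in accesses:
        if runs and runs[-1].size == size:
            last = runs[-1]
            # Offset of the most recent access already covered by the run.
            prev_offset = last.start + last.stride * (last.count - 1)
            gap = offset - prev_offset
            if last.count == 1:
                # A second access of the same size fixes the stride of the run.
                last.stride = gap
                last.count = 2
                continue
            if gap == last.stride:
                # The access continues the established pattern.
                last.count += 1
                continue
        # Otherwise the access starts a new run.
        runs.append(StridedRun(start=offset, size=size, stride=0, count=1))
    return runs


if __name__ == "__main__":
    # One process reads three 4-byte records 8 bytes apart, then two unrelated records.
    trace = [(0, 4), (8, 4), (16, 4), (100, 4), (200, 8)]
    for run in summarize(trace):
        print(run)
```

In this sketch, five trace entries collapse into three records, the first of which describes three accesses with a single tuple. A prefetcher could, under the same assumptions, extrapolate the next offset of a run as `start + stride * count`.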
