Combining I/O operations for multiple array variables in parallel netCDF

Parallel netCDF (PnetCDF) is a popular library used in many scientific applications to store scientific datasets. It provides high-performance parallel I/O while maintaining file-format compatibility with Unidata's netCDF. Array variables comprise the bulk of the data in a netCDF dataset, and for accesses to large regions of single array variables, PnetCDF attains very high performance. However, the current PnetCDF interface only allows access to one array variable per call. If an application instead accesses a large number of small-sized array variables, this interface limitation can cause significant performance degradation, because high end network and storage systems deliver much higher performance with larger request sizes. Moreover, the record variables data is stored interleaved by record, and the contiguity information is lost, so the existing MPI-IO collective I/O optimization can not help. This paper presents a new mechanism for PnetCDF to combine multiple I/O operations for better I/O performance. This mechanism can be used in a new function that takes arguments for reading/writing multiple array variables, allowing application programmers to explicitly access multiple array variables in a single call. It can also be used in the implementation of asynchronous I/O functions, so that the combination is carried out implicitly, without changes to the application. Our performance results demonstrate significant improvement using well-known application benchmarks.

[1]  Frank B. Schmuck,et al.  GPFS: A Shared-Disk File System for Large Computing Clusters , 2002, FAST.

[2]  Message P Forum,et al.  MPI: A Message-Passing Interface Standard , 1994 .

[3]  Marianne Winslett,et al.  Improving MPI-IO output performance with active buffering plus threads , 2003, Proceedings International Parallel and Distributed Processing Symposium.

[4]  Alex Rapaport,et al.  Mpi-2: extensions to the message-passing interface , 1997 .

[5]  David Kotz,et al.  Disk-directed I/O for MIMD multiprocessors , 1994, OSDI '94.

[6]  Wei-keng Liao,et al.  Collective caching: application-aware client-side file caching , 2005, HPDC-14. Proceedings. 14th IEEE International Symposium on High Performance Distributed Computing, 2005..

[7]  Robert B. Haber Scientific visualization and the Rivers Project at the National Center for Supercomputing Applications , 1989, Computer.

[8]  Lustre : A Scalable , High-Performance File System Cluster , 2003 .

[9]  Geoffrey M. Davis,et al.  NetCDF User''s Guide for C , 1997 .

[10]  Alok N. Choudhary,et al.  Improved parallel I/O via a two-phase run-time access strategy , 1993, CARN.

[11]  Russ Rew,et al.  NetCDF: an interface for scientific data access , 1990, IEEE Computer Graphics and Applications.

[12]  Rajeev Thakur,et al.  Users guide for ROMIO: A high-performance, portable MPI-IO implementation , 1997 .

[13]  Jianwei Li,et al.  Parallel netCDF: A High-Performance Scientific I/O Interface , 2003, ACM/IEEE SC 2003 Conference (SC'03).

[14]  Rajeev Thakur,et al.  A Case for Using MPI's Derived Datatypes to Improve I/O Performance , 1998, Proceedings of the IEEE/ACM SC98 Conference.

[15]  B. Fryxell,et al.  FLASH: An Adaptive Mesh Hydrodynamics Code for Modeling Astrophysical Thermonuclear Flashes , 2000 .

[16]  Wei-keng Liao,et al.  Dynamically adapting file domain partitioning methods for collective I/O based on underlying parallel file system locking protocols , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.

[17]  Marianne Winslett,et al.  Server-Directed Collective I/O in Panda , 1995, Proceedings of the IEEE/ACM SC95 Conference.

[18]  Message Passing Interface Forum MPI: A message - passing interface standard , 1994 .

[19]  Wei-keng Liao,et al.  Scalable Design and Implementations for MPI Parallel Overlapping I/O , 2006, IEEE Transactions on Parallel and Distributed Systems.

[20]  Rajeev Thakur,et al.  Data sieving and collective I/O in ROMIO , 1998, Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation.

[21]  Rajeev Thakur,et al.  An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays , 1996, Sci. Program..

[22]  Robert Latham,et al.  Parallel netCDF: A Scientific High-Performance I/O Interface , 2003, ArXiv.