Data sieving and collective I/O in ROMIO

The I/O access patterns of parallel programs often consist of accesses to a large number of small, noncontiguous pieces of data. If an application's I/O needs are met by making many small, distinct I/O requests, however, the I/O performance degrades drastically. To avoid this problem, MPI-IO allows users to access a noncontiguous data set with a single I/O function call. This feature provides MPI-IO implementations an opportunity to optimize data access. We describe how our MPI-IO implementation, ROMIO, delivers high performance in the presence of noncontiguous requests. We explain in detail the two key optimizations ROMIO performs: data sieving for noncontiguous requests from one process and collective I/O for noncontiguous requests from multiple processes. We describe how one can implement these optimizations portably on multiple machines and file systems, control their memory requirements, and also achieve high performance. We demonstrate the performance and portability with performance results for three applications-an astrophysics-application template (DIST3D) the NAS BTIO benchmark, and an unstructured code (UNSTRUC)-on five different parallel machines: HP Exemplar IBM SP, Intel Paragon, NEC SX-4, and SGI Origin2000.

[1]  P. R. Cappello,et al.  Implementing the beam and warming method on the hypercube , 1989, C3P.

[2]  Alok N. Choudhary,et al.  Improved parallel I/O via a two-phase run-time access strategy , 1993, CARN.

[3]  David Kotz,et al.  Disk-directed I/O for MIMD multiprocessors , 1994, OSDI '94.

[4]  M. Winslett,et al.  Server-Directed Collective I/O in Panda , 1995, Proceedings of the IEEE/ACM SC95 Conference.

[5]  D.A. Reed,et al.  Input/Output Characteristics of Scalable Parallel Applications , 1995, Proceedings of the IEEE/ACM SC95 Conference.

[6]  Dror G. Feitelson,et al.  Mpi-io: a parallel file i/o interface for mpi , 1995 .

[7]  Andrew A. Chien,et al.  I/O requirements of scientific applications: an evolutionary view , 1996, Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing.

[8]  Rajeev Thakur,et al.  An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays , 1996, Sci. Program..

[9]  Rajeev Thakur,et al.  An Experimental Evaluation of the Parallel I/O Systems of the IBM SP and Intel Paragon Using a Production Application , 1996, ACPC.

[10]  Bill Nitzberg,et al.  PMPIO-a portable implementation of MPI-IO , 1996, Proceedings of 6th Symposium on the Frontiers of Massively Parallel Computation (Frontiers '96).

[11]  Carla Schlatter Ellis,et al.  File-Access Characteristics of Parallel Scientific Workloads , 1996, IEEE Trans. Parallel Distributed Syst..

[12]  Anthony Skjellum,et al.  A High-Performance, Portable Implementation of the MPI Message Passing Interface Standard , 1996, Parallel Comput..

[13]  E. Lusk,et al.  An abstract-device interface for implementing portable parallel-I/O interfaces , 1996, Proceedings of 6th Symposium on the Frontiers of Massively Parallel Computation (Frontiers '96).

[14]  Rajeev Thakur,et al.  Passion: Optimized I/O for Parallel Applications , 1996, Computer.

[15]  Sandra Johnson Baylor,et al.  Parallel I/O Workload Characteristics Using Vesta , 1996, Input/Output in Parallel and Distributed Computer Systems.

[16]  Alex Rapaport,et al.  Mpi-2: extensions to the message-passing interface , 1997 .

[17]  Evgenia Smirni,et al.  Lessons from Characterizing the Input/Output Behavior of Parallel Scientific Applications , 1998, Perform. Evaluation.

[18]  Rajeev Thakur,et al.  A Case for Using MPI's Derived Datatypes to Improve I/O Performance , 1998, Proceedings of the IEEE/ACM SC98 Conference.