Noncontiguous I/O through PVFS

With the tremendous advances in processor and memory technology, I/O has risen to become the bottleneck in high-performance computing for many applications. The development of parallel file systems has helped to ease the performance gap, but I/O still remains an area needing significant performance improvement. Research has found that noncontiguous I/O access patterns in scientific applications combined with current file system methods, to perform these accesses lead to unacceptable performance for large data sets. To enhance performance of noncontiguous I/O, we have created list I/O, a native version of noncontiguous I/O. We have used the Parallel Virtual File System (PVFS) to implement our ideas. Our research and experimentation shows that list I/O outperforms current noncontiguous I/O access methods in most I/O situations and can substantially enhance the performance of real-world scientific applications.

[1]  Carla Schlatter Ellis,et al.  File-Access Characteristics of Parallel Scientific Workloads , 1996, IEEE Trans. Parallel Distributed Syst..

[2]  Robert B. Ross,et al.  PVFS: A Parallel File System for Linux Clusters , 2000, Annual Linux Showcase & Conference.

[3]  Sandra Johnson Baylor,et al.  Parallel I/O Workload Characteristics Using Vesta , 1996, Input/Output in Parallel and Distributed Computer Systems.

[4]  Robert B. Ross,et al.  A case study in application I/O on Linux clusters , 2001, SC.

[5]  Rajeev Thakur,et al.  Passion: Optimized I/O for Parallel Applications , 1996, Computer.

[6]  Robert Ross,et al.  Implementation and performance of a parallel file system for high performance distributed applications , 1996, Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing.

[7]  Andrew A. Chien,et al.  Input/Output Characteristics of Scalable Parallel Applications , 1995, SC.

[8]  Remy Evard Chiba city: the Argonne scalable cluster , 2001 .

[9]  Andrew A. Chien,et al.  I/O requirements of scientific applications: an evolutionary view , 1996, Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing.

[10]  Rajeev Thakur,et al.  Data sieving and collective I/O in ROMIO , 1998, Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation.

[11]  B. Fryxell,et al.  FLASH: An Adaptive Mesh Hydrodynamics Code for Modeling Astrophysical Thermonuclear Flashes , 2000 .

[12]  Rajeev Thakur,et al.  On implementing MPI-IO portably and with high performance , 1999, IOPADS '99.