Cooperative Client-Side File Caching for MPI Applications

Client-side file caching is one of many I/O strategies adopted by today's parallel file systems that were initially designed for distributed systems. Most of these implementations treat each client independently because clients' computations are seldom related to each other in a distributed environment. However, it is misguided to apply the same assumption directly to high-performance computers where many parallel I/O operations come from a group of processes working within the same parallel application. Thus, file caching could perform more effectively if the scope of processes sharing the same file is known. In this paper, we propose a client-side file caching system for MPI applications that perform parallel I/O operations on shared files. In our design, an I/O thread is created and runs concurrently with the main thread in each MPI process. The MPI processes that collectively open a shared file use the I/O threads to cooperate with each other to handle file requests, cache page access, and coherence control. By bringing the caching subsystem closer to the applications as a user space library, it can be incorporated into an MPI I/O implementation to increase its portability. Performance evaluations using three I/O benchmarks demonstrate a significant improvement over traditional methods that use either byte-range file locking or rely on coherent I/O provided by the file system.

[1]  B. Fryxell,et al.  FLASH: An Adaptive Mesh Hydrodynamics Code for Modeling Astrophysical Thermonuclear Flashes , 2000 .

[2]  Corporate Ieee,et al.  Information Technology-Portable Operating System Interface , 1990 .

[3]  Jesús Labarta,et al.  PACA: A Cooperative File System Cache for Parallel Machines , 1996, Euro-Par, Vol. I.

[4]  Andrew S. Tanenbaum,et al.  Distributed systems: Principles and Paradigms , 2001 .

[5]  Alice E. Koniges,et al.  Towards a High-Performance Implementation of MPI-IO on Top of GPFS , 2000, Euro-Par.

[6]  Andrew A. Chien,et al.  PPFS: a high performance portable parallel file system , 1995, ICS '95.

[7]  Message Passing Interface Forum MPI: A message - passing interface standard , 1994 .

[8]  Rajeev Thakur,et al.  On implementing MPI-IO portably and with high performance , 1999, IOPADS '99.

[9]  Message P Forum,et al.  MPI: A Message-Passing Interface Standard , 1994 .

[10]  Alex Rapaport,et al.  Mpi-2: extensions to the message-passing interface , 1997 .

[11]  Wei-keng Liao,et al.  DAChe: Direct Access Cache System for Parallel I/O , 2005 .

[12]  Bin Jia,et al.  MPI-IO/GPFS, an Optimized Implementation of MPI-IO on Top of GPFS , 2001, ACM/IEEE SC 2001 Conference (SC'01).

[13]  Lustre : A Scalable , High-Performance File System Cluster , 2003 .

[14]  Michael Dahlin,et al.  Cooperative caching: using remote client memory to improve file system performance , 1994, OSDI '94.

[15]  Tao Yang,et al.  The Panasas ActiveScale Storage Cluster - Delivering Scalable High Bandwidth Storage , 2004, Proceedings of the ACM/IEEE SC2004 Conference.

[16]  Anna R. Karlin,et al.  Implementing cooperative prefetching and caching in a globally-managed memory system , 1998, SIGMETRICS '98/PERFORMANCE '98.

[17]  Alok N. Choudhary,et al.  Improved parallel I/O via a two-phase run-time access strategy , 1993, CARN.

[18]  Rob VanderWijngaart,et al.  NAS Parallel Benchmarks I/O Version 2.4. 2.4 , 2002 .

[19]  Bill Nitzberg,et al.  PMPIO-a portable implementation of MPI-IO , 1996, Proceedings of 6th Symposium on the Frontiers of Massively Parallel Computation (Frontiers '96).

[20]  Hai Jin,et al.  Parallel File Systems , 2002 .

[21]  Frank B. Schmuck,et al.  GPFS: A Shared-Disk File System for Large Computing Clusters , 2002, FAST.

[22]  Marianne Winslett,et al.  Improving MPI-IO output performance with active buffering plus threads , 2003, Proceedings International Parallel and Distributed Processing Symposium.

[23]  Robert B. Ross,et al.  PVFS: A Parallel File System for Linux Clusters , 2000, Annual Linux Showcase & Conference.

[24]  Florin Isaila,et al.  Integrating collective I/O and cooperative caching into the "clusterfile" parallel file system , 2004, ICS '04.

[25]  Ieee Standards Board System application program interface (API) (C language) , 1990 .