The Cluster File System: Integration of High Performance Communication and I/O in Clusters

In this paper, we report on our experience in designing a portable parallel file system for clusters. The file system offers applications an interface compliant with MPI-IO, the I/O interface of the MPI-2 standard. Its implementation relies on MPI for internal coordination and communication, which provides high performance and portability across a wide range of hardware and software cluster platforms. The internal architecture has been designed to allow rapid prototyping of, and experimentation with, novel strategies for managing parallel I/O in a cluster environment. The discussion of the design and early implementation is complemented by basic performance measurements that confirm the potential of the approach.
