Clusterfile: a flexible physical layout parallel file system

This paper presents Clusterfile, a parallel file system that provides parallel file access on a cluster of computers. We introduce the file partitioning model on which the design of Clusterfile is based. The model uses a data representation that is optimized for multidimensional array partitioning while still allowing arbitrary partitions. We show how the model is used to partition a file into both physical subfiles and logical views. We also describe how conversion between two partitions of the same file is implemented with a general memory redistribution algorithm, and how that algorithm is used to optimize non-contiguous read and write operations. The experimental results include performance comparisons with the Parallel Virtual File System (PVFS) and an MPI-IO implementation for PVFS. Copyright © 2003 John Wiley & Sons, Ltd.
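To make the idea of converting between two partitions of the same file concrete, the following is a minimal sketch and not the Clusterfile algorithm itself. It assumes the simplest possible case: a one-dimensional file split block-cyclically (round-robin) into physical subfiles and, with a different block size, into logical views. The function names `owner` and `redistribution_plan` and the `(block, parts)` tuple layout are hypothetical, chosen only for illustration.

```python
def owner(offset, block, parts):
    """Map a file offset to (partition index, offset inside that partition)
    for a block-cyclic partition with `parts` partitions of `block` bytes."""
    stripe, within = divmod(offset, block)      # which block, position in it
    round_no, part = divmod(stripe, parts)      # which round, which partition
    return part, round_no * block + within


def redistribution_plan(file_size, phys, view):
    """List the contiguous runs that must move between subfiles and views.

    `phys` and `view` are (block, parts) tuples (an assumed layout).
    Each entry is (subfile, subfile_offset, view, view_offset, length).
    """
    plan = []
    off = 0
    while off < file_size:
        # A run ends at the next block boundary of either partition,
        # or at the end of the file, whichever comes first.
        run = min(phys[0] - off % phys[0],
                  view[0] - off % view[0],
                  file_size - off)
        sub, sub_off = owner(off, *phys)
        vw, vw_off = owner(off, *view)
        plan.append((sub, sub_off, vw, vw_off, run))
        off += run
    return plan


if __name__ == "__main__":
    # 16-byte file, 2 subfiles with 4-byte blocks, 2 views with 2-byte blocks.
    for entry in redistribution_plan(16, phys=(4, 2), view=(2, 2)):
        print(entry)
```

Each tuple in the plan names one contiguous piece that a view shares with a subfile, so a non-contiguous read or write on a view can be served as a short list of contiguous transfers. A representation optimized for multidimensional arrays, as the abstract describes, would replace the single `(block, parts)` pair with a richer description of each partition.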
