GPFS: A Shared-Disk File System for Large Computing Clusters

GPFS is IBM's parallel, shared-disk file system for cluster computers, available on the RS/6000 SP parallel supercomputer and on Linux clusters. GPFS is used on many of the largest supercomputers in the world. GPFS was built on many of the ideas that were developed in the academic community over the last several years, particularly distributed locking and recovery technology. To date it has been a matter of conjecture how well these ideas scale. We have had the opportunity to test those limits in the context of a product that runs on the largest systems in existence. While in many cases existing ideas scaled well, new approaches were necessary in many key areas. This paper describes GPFS, and discusses how distributed locking and recovery techniques were extended to scale to large clusters.
