GFS: A Distributed File System with Multi-source Data Access and Replication for Grid Computing

In this paper, we design and implement the Grid File System (GFS), a distributed file system with multi-source data replication for Unix-based grid systems. Traditional distributed file system technologies, designed for local and campus area networks, do not adapt well to wide area grid computing environments. We therefore design GFS to meet the needs of grid computing. With GFS, existing applications can access remote files without modification, and jobs submitted to grid systems can access data transparently. GFS is easy to deploy and can be accessed without special accounts. Our system also provides strong security mechanisms and a multi-source data transfer method to increase communication throughput.