Design and Implementation of GeoFS: A Wide-Area File System

We propose GeoFS, a POSIX-compliant, wide-area distributed file system for sharing files between sites. GeoFS uses FUSE to provide standard file system interfaces to applications, and it lets users control consistency and replication via extended attributes. In the era of big data, traditional file systems handle updates to large directories (i.e., directories containing a huge number of files) poorly: if even a small fraction of a directory changes, the entire cached copy of the directory metadata must be discarded and a new copy fetched from the remote server, resulting in poor performance. We address this issue by partitioning metadata into blocks and transferring only the modified block(s) over the network. GeoFS also supports client caching, prefetching, parallel reads, and compression, making it suitable for networks with high latency and low bandwidth. Performance tests demonstrate that GeoFS outperforms NFS in a wide-area environment.
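
The block-level metadata update described above can be illustrated with a minimal sketch. It is not the GeoFS implementation: the abstract does not say how blocks are formed or how changes are detected, so the fixed block count (`NUM_BLOCKS`), the name-hash placement (`block_index`), and the per-block SHA-256 digests used here are all assumptions made for illustration, as are the function names.

```python
import hashlib
import zlib

NUM_BLOCKS = 64  # hypothetical value; the abstract gives no block count or size


def block_index(name: str) -> int:
    """Assign a file name to a block using a stable hash (crc32, not Python's salted hash())."""
    return zlib.crc32(name.encode("utf-8")) % NUM_BLOCKS


def partition(entries: dict) -> list:
    """Split directory metadata (name -> serialized attributes) into NUM_BLOCKS blocks."""
    blocks = [{} for _ in range(NUM_BLOCKS)]
    for name, attrs in entries.items():
        blocks[block_index(name)][name] = attrs
    return blocks


def digest(block: dict) -> str:
    """Digest of one block; entries are sorted so the result does not depend on insertion order."""
    h = hashlib.sha256()
    for name in sorted(block):
        h.update(name.encode("utf-8") + b"\0" + block[name].encode("utf-8") + b"\0")
    return h.hexdigest()


def changed_blocks(cached: list, server: list) -> list:
    """Return the indices of blocks whose digests differ between cache and server."""
    return [i for i, (c, s) in enumerate(zip(cached, server)) if digest(c) != digest(s)]


def refresh(cached: list, server: list) -> int:
    """Update the cache in place, copying only changed blocks; return the number copied.

    In a real deployment only the digests and the stale blocks would cross the
    network; here both sides live in one process to keep the sketch runnable.
    """
    stale = changed_blocks(cached, server)
    for i in stale:
        cached[i] = dict(server[i])
    return len(stale)


# Usage: a 10,000-entry directory in which a single file's attributes change.
server_dir = {f"file{i}": "size=0" for i in range(10_000)}
cache = partition(server_dir)
server_dir["file42"] = "size=4096"
transferred = refresh(cache, partition(server_dir))
```

In this sketch, changing one file's attributes leaves every other block's digest unchanged, so only one of the 64 blocks is copied rather than the whole directory listing. Placing entries by name hash keeps each file in the same block across updates; a real system might instead partition by sorted name range or fixed byte size.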