BeanFS: A Distributed File System for Large-scale E-mail Services

Distributed file systems running on clusters of inexpensive commodity hardware are increasingly recognized as an effective way to support the explosive growth of storage demand at large-scale Internet service companies. This paper presents the design and implementation of BeanFS, a distributed file system for large-scale e-mail services. BeanFS is tailored to e-mail services in three ways. First, its volume-based replication scheme reduces the metadata management overhead that the central metadata server incurs when dealing with a very large number of small files. Second, BeanFS employs a lightweight consistency maintenance protocol tailored to the simple access patterns of e-mail messages. Third, transient and permanent failures are treated separately, and recovery from transient failures completes quickly and with lower overhead.