A File Level RAID in Blue Whale File System

Blue Whale File System (BWFS) is a distributed file system with proven high performance and scalability. To make BWFS highly reliable as well, we designed and implemented BW-FILERAID, a new architecture that integrates RAID technology into the distributed file system. BW-FILERAID uses client-driven, file-level RAID to meet the reliability requirement, and it exploits file system information to optimize RAID performance. We propose two schemes, Dynamic Stripe Allocation (DSA) and Asynchronous Parity Calculating (APC), to reduce the impact of the small-write performance problem incurred by classic RAID5 implementations. This paper introduces and evaluates BW-FILERAID. Compared with Linux software RAID5, BW-FILERAID achieves 8.2% to 25.9% higher write performance while delivering the same read performance as RAID0. BW-FILERAID also scales well.
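The small-write problem mentioned above comes from how classic RAID5 maintains parity: a write that covers a whole stripe can compute parity directly from the new data, whereas a write to a single block must first read the old data and the old parity. The following Python sketch illustrates only this general RAID5 parity arithmetic. It is not the BW-FILERAID implementation and does not model DSA or APC; the function names (`xor_blocks`, `full_stripe_parity`, `small_write_parity`, `reconstruct`) and the toy four-byte blocks are assumptions made for illustration.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks byte by byte."""
    if not blocks:
        raise ValueError("at least one block is required")
    if any(len(b) != len(blocks[0]) for b in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def full_stripe_parity(data_blocks: list[bytes]) -> bytes:
    """Full-stripe write: parity comes from the new data alone (no reads)."""
    return xor_blocks(*data_blocks)


def small_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Small write (read-modify-write): the old data block and the old parity
    must be read before the new parity can be computed, so one logical write
    costs two reads and two writes."""
    return xor_blocks(old_parity, old_data, new_data)


def reconstruct(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Recover one lost data block from the parity and the surviving blocks."""
    return xor_blocks(parity, *surviving_blocks)


if __name__ == "__main__":
    stripe = [b"AAAA", b"BBBB", b"CCCC"]  # hypothetical data blocks
    parity = full_stripe_parity(stripe)

    # Overwrite the middle block using the read-modify-write path.
    new_block = b"ZZZZ"
    updated_parity = small_write_parity(stripe[1], parity, new_block)

    # Both paths must agree on the resulting parity.
    assert updated_parity == full_stripe_parity([stripe[0], new_block, stripe[2]])

    # The lost middle block can be rebuilt from the other blocks plus parity.
    assert reconstruct([stripe[0], stripe[2]], parity) == stripe[1]
```

Both paths produce the same parity, but the read-modify-write path needs extra I/O per update. That overhead is the cost the abstract says DSA and APC are designed to reduce.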
