Exploiting redundancy to boost performance in a RAID-10 style cluster-based file system

Aggregating the throughput of existing disks on cluster nodes is a cost-effective way to alleviate the I/O bottleneck in cluster computing, but it can degrade performance because storage data processing and user task computation contend for shared resources on the same node. This paper proposes to judiciously exploit the storage redundancy, in the form of mirroring, that exists in a RAID-10 style file system to alleviate this degradation. More specifically, a heuristic scheduling algorithm, motivated by observations of a simple cluster configuration, is developed to spatially schedule write operations on the less-loaded node of each mirroring pair. The modified data is duplicated to the mirroring nodes asynchronously in the background. Read performance is improved by two techniques: doubling the degree of parallelism and hot-spot skipping. A synthetic benchmark is used to evaluate these algorithms in a real cluster environment, and the proposed algorithms are shown to be very effective in enhancing performance.
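The scheduling ideas summarized above can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details the abstract does not give. The `Node` type, the single scalar `load` metric, the `hot_threshold` value, and the function names are all hypothetical, chosen only to show how writes might be steered to the less-loaded node of each mirroring pair and how reads might either be split across both mirrors or skip a hot spot.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    load: float  # hypothetical load metric in [0, 1]; the paper's actual metric is not stated here


def schedule_writes(pairs):
    """For each mirroring pair, pick the less-loaded node as the write target.

    Returns a list of (write_target, deferred_mirror) name tuples. The second
    node would receive its copy later, asynchronously in the background.
    """
    plan = []
    for a, b in pairs:
        target, mirror = (a, b) if a.load <= b.load else (b, a)
        plan.append((target.name, mirror.name))
    return plan


def schedule_reads(pairs, stripe_bytes, hot_threshold=0.8):
    """Assign read work for one stripe unit per mirroring pair.

    If exactly one node of a pair is "hot" (load >= hot_threshold, an assumed
    cutoff), it is skipped and its mirror serves the whole unit. Otherwise the
    unit is split between both nodes, doubling the degree of parallelism.
    """
    plan = {}
    for a, b in pairs:
        a_hot = a.load >= hot_threshold
        b_hot = b.load >= hot_threshold
        if a_hot and not b_hot:
            plan[b.name] = stripe_bytes
        elif b_hot and not a_hot:
            plan[a.name] = stripe_bytes
        else:
            half = stripe_bytes // 2
            plan[a.name] = half
            plan[b.name] = stripe_bytes - half
    return plan


if __name__ == "__main__":
    pairs = [
        (Node("p0", 0.9), Node("m0", 0.2)),
        (Node("p1", 0.1), Node("m1", 0.3)),
    ]
    print(schedule_writes(pairs))
    print(schedule_reads(pairs, stripe_bytes=64))
```

In this sketch the write to the first pair goes to `m0` because `p0` is heavily loaded, and the read for that pair also skips `p0`. The second pair has no hot node, so its read is split evenly across both mirrors. A real system would also need to track which mirror holds the newest data while background duplication is pending, which the sketch leaves out.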
