Reliability Analysis of Highly Redundant Distributed Storage Systems with Dynamic Refuging

In modern data centres, large-scale storage systems that hold big data comprise thousands of large-capacity drives. Our goal is to establish a method for building highly reliable storage systems from more than a thousand low-cost, large-capacity drives. Some large-scale storage systems protect data with erasure coding to prevent data loss. Raising the redundancy level of the erasure code lowers the probability of data loss, but it also increases the cost of normal write operations and the additional storage needed for coding. We therefore need to achieve high reliability at the lowest possible redundancy level. There are two concerns regarding reliability in large-scale storage systems: (i) as the number of drives increases, systems become more exposed to multiple drive failures, and (ii) distributing stripes among many drives can shorten the rebuild time but increases the risk of data loss due to multiple drive failures. These concerns were not addressed in prior quantitative reliability studies based on realistic settings. In this work, we analyze the reliability of large-scale storage systems with distributed stripes, focusing on an effective rebuild method which we call Dynamic Refuging. Dynamic Refuging rebuilds failed storage areas starting with those that have the lowest remaining redundancy, and strategically selects the blocks to read when repairing lost data. We modeled how the amount of storage at each redundancy level changes dynamically under multiple drive failures, and performed a reliability analysis by Monte Carlo simulation using realistic drive failure characteristics. When stripes with redundancy level 3 were sufficiently distributed and rebuilt by Dynamic Refuging, we found that the probability of data loss decreased by two orders of magnitude for systems with 384 or more drives compared with conventional RAID. The technique scaled well: a system with 1536 inexpensive drives attained a lower data loss probability than RAID 6 with 16 enterprise-class drives.
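
The rebuild ordering described above (lowest remaining redundancy first) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stripe layout, the function names `remaining_redundancy` and `rebuild_order`, and the use of a binary heap are all assumptions made for illustration, and the sketch does not model how Dynamic Refuging chooses which blocks to read.

```python
import heapq
from typing import Dict, List, Set, Tuple


def remaining_redundancy(stripe_drives: Set[int], failed: Set[int], parity: int) -> int:
    """Failures the stripe can still absorb; a negative value means data loss.

    Assumes one block of the stripe per drive (a hypothetical layout).
    """
    return parity - len(stripe_drives & failed)


def rebuild_order(
    stripes: Dict[str, Set[int]], failed: Set[int], parity: int
) -> Tuple[List[Tuple[int, str]], List[str]]:
    """Return (queue, lost).

    queue lists damaged but recoverable stripes as (redundancy, stripe_id),
    with the most endangered stripes first. lost lists stripes that have
    exceeded the redundancy level and cannot be rebuilt.
    """
    heap: List[Tuple[int, str]] = []
    lost: List[str] = []
    for stripe_id, drives in stripes.items():
        r = remaining_redundancy(drives, failed, parity)
        if r < 0:
            lost.append(stripe_id)
        elif r < parity:  # untouched stripes need no rebuild
            heapq.heappush(heap, (r, stripe_id))
    queue = [heapq.heappop(heap) for _ in range(len(heap))]
    return queue, lost


# Hypothetical example: three stripes of five blocks, redundancy level 3.
stripes = {
    "a": {0, 1, 2, 3, 4},
    "b": {1, 2, 5, 6, 7},
    "c": {5, 6, 7, 8, 9},
}
queue, lost = rebuild_order(stripes, failed={0, 1, 2}, parity=3)
print(queue, lost)
```

In this example stripe "a" has lost three blocks and has no redundancy left, so it is scheduled before stripe "b", which can still tolerate one more failure; stripe "c" is untouched and is not queued.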
