A Hierarchical RAID Architecture Towards Fast Recovery and High Reliability

Disk failures are common in modern storage systems because they are built from large numbers of inexpensive disks. Moreover, recovering a failed disk takes a long time because of the disk's large capacity and limited I/O bandwidth. To speed up recovery while maintaining high system reliability, we propose OI-RAID, a hierarchical erasure-code architecture that consists of two layers of codes: an outer-layer code and an inner-layer code. Specifically, the outer-layer code is deployed with a disk-grouping technique based on a Balanced Incomplete Block Design (BIBD) or a complete graph, with a skewed data layout, so that all disks can serve I/O in parallel for fast failure recovery; the inner-layer code is deployed within each group of disks to provide high reliability. As an example, we deploy RAID5 in both layers to tolerate at least three disk failures, which meets the data availability requirements of practical systems, and to achieve a much higher speedup ratio for disk failure recovery than existing approaches. In addition, OI-RAID retains optimal data update complexity and incurs low storage overhead in practice.
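To illustrate why BIBD-based grouping spreads recovery I/O across all disks, the following is a minimal sketch, not the OI-RAID layout itself. It assumes a small (7, 3, 1)-BIBD built from the cyclic difference set {0, 1, 3} modulo 7; the function names `cyclic_bibd`, `pair_counts`, and `recovery_helpers` are hypothetical, and the skewed data layout and the parity placement of both layers are not modeled.

```python
from collections import Counter
from itertools import combinations


def cyclic_bibd(v, base_block):
    """Develop a base block modulo v into v blocks (here: disk groups)."""
    return [sorted((x + shift) % v for x in base_block) for shift in range(v)]


def pair_counts(blocks):
    """Count how many blocks contain each unordered pair of disks."""
    counts = Counter()
    for block in blocks:
        for pair in combinations(block, 2):
            counts[pair] += 1
    return counts


def recovery_helpers(blocks, failed):
    """For each surviving disk, count the groups it shares with the failed disk."""
    helpers = Counter()
    for block in blocks:
        if failed in block:
            for disk in block:
                if disk != failed:
                    helpers[disk] += 1
    return helpers


# Assumption: 7 disks, groups of 3, every pair of disks sharing exactly one group.
groups = cyclic_bibd(7, (0, 1, 3))
counts = pair_counts(groups)
helpers = recovery_helpers(groups, failed=0)

print(groups)
print(dict(helpers))
```

In this toy design every pair of disks shares exactly one group, so when one disk fails, each of the six surviving disks takes part in exactly one of the affected groups. That even spread is the property that lets recovery reads proceed in parallel on all disks instead of being concentrated on a few.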
