Triple Failure Tolerant Storage Systems Using Only Exclusive-Or Parity Calculations

We present a disk array organization that can survive three simultaneous disk failures while using only exclusive-or operations to calculate the parities that provide this failure tolerance. The reliability of storage systems built from magnetic disks depends on how prone the individual disks are to failure. Unfortunately, disk failure rates are impossible to predict, and it is well known that individual batches can be subject to much higher failure rates at some point during their lifetime. It is also known that many disk drive families, but not all, suffer a substantially higher failure rate at the beginning of their economic lifespan, and some at the end. Our proposed organization can be built on top of a dense two-failure tolerant layout that uses only exclusive-or operations and has a ratio of parity to data disks of 2/k. If the disk failure rates are higher than expected, the new organization can be superimposed on the existing two-failure tolerant organization by introducing (k+1)/2 new parity disks and (k+1)/2 new reliability stripes. The result is a three-failure tolerant layout obtained without moving any data or calculating any parity other than the new one. We derive the organization using a graph visualization and a construction by Lawless for factoring a complete graph into paths.
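The graph view can be illustrated with a small sketch. It rests on assumptions that the abstract does not spell out: the k+1 original parity disks are taken as the vertices of a complete graph, each data disk is an edge that belongs to the two parity stripes at its endpoints (which gives the 2/k parity-to-data ratio), and each new reliability stripe is one path of a decomposition of that graph into (k+1)/2 Hamiltonian paths. The decomposition used here is the standard zigzag construction, which is not necessarily the one by Lawless, and the function names are hypothetical.

```python
from itertools import combinations


def zigzag_paths(n):
    """Split the edges of the complete graph K_n (n even) into n/2 Hamiltonian paths."""
    if n < 2 or n % 2:
        raise ValueError("n must be an even number >= 2")
    # Offsets 0, +1, -1, +2, -2, ..., +n/2 visit every vertex exactly once.
    offsets = [0]
    for j in range(1, n // 2):
        offsets += [j, -j]
    offsets.append(n // 2)
    return [[(start + d) % n for d in offsets] for start in range(n // 2)]


def tolerates_three_failures(n):
    """Check that the assumed layout has minimum codeword weight >= 4.

    A linear XOR code with minimum distance 4 can recover from any 3 lost
    disks. A codeword that touches 4 or more data disks already has weight
    >= 4, so only sets of 1 to 3 data disks need to be examined.
    """
    stripe_of = {}  # data disk (edge) -> index of its new path stripe
    for stripe, path in enumerate(zigzag_paths(n)):
        for u, v in zip(path, path[1:]):
            stripe_of[frozenset((u, v))] = stripe
    assert len(stripe_of) == n * (n - 1) // 2  # every edge is in exactly one path
    for size in (1, 2, 3):
        for subset in combinations(stripe_of, size):
            odd_vertices = set()  # original parity disks that flip
            odd_stripes = set()   # new parity disks that flip
            for edge in subset:
                odd_vertices ^= edge
                odd_stripes ^= {stripe_of[edge]}
            if size + len(odd_vertices) + len(odd_stripes) < 4:
                return False
    return True


if __name__ == "__main__":
    k = 5  # hypothetical example: k + 1 = 6 original parity disks, 15 data disks
    for path in zigzag_paths(k + 1):
        print(path)
    print(tolerates_three_failures(k + 1))
```

In this sketch the new parity disk of a path stripe would hold the exclusive-or of the k data disks along that path, so adding the stripes touches neither the data nor the original parities.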

[1]  Zhou Jie.  The Study of Graph Decompositions and Placement of Parity and Data to Tolerate Two Failures in Disk Arrays: Conditions and Existance , 2003 .

[2]  Jehoshua Bruck,et al.  Low density MDS codes and factors of complete graphs , 1998, Proceedings. 1998 IEEE International Symposium on Information Theory (Cat. No.98CH36252).

[3]  N. S. Mendelsohn,et al.  Handcuffed designs , 1977, Discret. Math..

[4]  Cheng Huang,et al.  Erasure Coding in Windows Azure Storage , 2012, USENIX Annual Technical Conference.

[5]  Cheng Huang,et al.  In Search of I/O-Optimal Recovery from Disk Failures , 2011, HotStorage.

[6]  Ahmed Amer,et al.  Protecting RAID Arrays against Unexpectedly High Disk Failure Rates , 2014, 2014 IEEE 20th Pacific Rim International Symposium on Dependable Computing.

[7]  Ethan L. Miller,et al.  Reliability of flat XOR-based erasure codes on heterogeneous devices , 2008, 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN).

[8]  Ahmed Amer,et al.  Highly reliable two-dimensional RAID arrays for archival storage , 2012, 2012 IEEE 31st International Performance Computing and Communications Conference (IPCCC).

[9]  Robert Michael Tanner,et al.  A recursive approach to low complexity codes , 1981, IEEE Trans. Inf. Theory.

[10]  Sanjay Ghemawat,et al.  The Google file system , 2003 .

[11]  Ahmad Patooghy,et al.  A Low-Power and SEU-Tolerant Switch Architecture for Network on Chips , 2007 .

[12]  Cheng Huang,et al.  STAR: An Efficient Coding Scheme for Correcting Triple Storage Node Failures , 2005, IEEE Transactions on Computers.

[13]  Prince Camille de Polignac.  On a Problem in Combinations , 1866 .

[14]  Daniel A. Spielman,et al.  Practical loss-resilient codes , 1997, STOC '97.

[15]  Robert H. Deng,et al.  New efficient MDS array codes for RAID. Part I. Reed-Solomon-like codes for tolerating three disk failures , 2005, IEEE Transactions on Computers.

[16]  James S. Plank,et al.  A practical analysis of low-density parity-check erasure codes for wide-area storage applications , 2004, International Conference on Dependable Systems and Networks, 2004.

[18]  Bianca Schroeder,et al.  Understanding latent sector errors and how to protect against them , 2010, TOS.

[19]  Darrell D. E. Long,et al.  Reliability of Disk Arrays with Double Parity , 2013, 2013 IEEE 19th Pacific Rim International Symposium on Dependable Computing.

[20]  Gang Wang,et al.  Combinatorial Constructions of Multi-erasure-Correcting Codes with Independent Parity Symbols for Storage Systems , 2007, 13th Pacific Rim International Symposium on Dependable Computing (PRDC 2007).

[21]  Darrell D. E. Long,et al.  Three-Dimensional Redundancy Codes for Archival Storage , 2013, 2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems.

[22]  Xiaozhou Li,et al.  Flat XOR-based erasure codes in storage systems: Constructions, efficient recovery, and tradeoffs , 2010, 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST).

[23]  J. F. Lawless,et al.  On the Construction of Handcuffed Designs , 1974, J. Comb. Theory A.

[24]  Zheng Shao,et al.  Data warehousing and analytics infrastructure at facebook , 2010, SIGMOD Conference.

[25]  Shankar Pasupathy,et al.  An analysis of latent sector errors in disk drives , 2007, SIGMETRICS '07.

[26]  Bianca Schroeder,et al.  Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You? , 2007, FAST.

[27]  Bianca Schroeder,et al.  Understanding failures in petascale computers , 2007 .