Evaluation of Applied Intra-disk Redundancy Schemes to Improve Single Disk Reliability

As disk drive capacities have grown exponentially, it has become increasingly likely that a drive suffers not only a complete failure but also errors in individual, small groups of sectors. These sector errors are especially critical during RAID rebuilds because they can only be detected when the corresponding sectors are read. Mechanisms to cope with sector errors have therefore become an important way to improve disk reliability. One approach is intra-disk redundancy, where additional redundancy blocks are calculated and stored for each set of disk sectors. Previous studies have introduced intra-disk redundancy schemes and evaluated their impact on disk reliability, but none of them has evaluated the influence on disk drive performance or the underlying energy consumption. The study presented in this paper benchmarks existing schemes with respect to these metrics. It shows the surprising result that weaker codes, combined with newly introduced scrambling techniques, can produce faster layouts whose reliability properties are similar to those of previously proposed strong codes.
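
To make the idea of intra-disk redundancy concrete, the following minimal sketch computes one XOR parity block over a set of sectors and uses it to rebuild a single unreadable sector. It is an illustration only, not one of the schemes benchmarked in the paper: the sector size, the single-parity code, and the function names (`xor_blocks`, `encode_segment`, `recover_sector`) are all assumptions made for this example.

```python
from functools import reduce

SECTOR_SIZE = 512  # bytes; assumed sector size, for illustration only


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode_segment(data_sectors):
    """Compute one parity sector for a set of data sectors (hypothetical helper)."""
    return xor_blocks(data_sectors)


def recover_sector(sectors, parity, lost_index):
    """Rebuild the sector at lost_index from the surviving sectors and the parity."""
    survivors = [s for i, s in enumerate(sectors) if i != lost_index]
    return xor_blocks(survivors + [parity])


# Usage: protect four sectors, then rebuild sector 2 as if it were unreadable.
segment = [bytes([i]) * SECTOR_SIZE for i in range(1, 5)]
parity = encode_segment(segment)
rebuilt = recover_sector(segment, parity, lost_index=2)
```

A single parity block per set tolerates only one lost sector in that set. Stronger codes tolerate more errors at the cost of additional computation and storage, which is the kind of trade-off the abstract refers to.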
