AFRAID - A Frequently Redundant Array of Independent Disks

Disk arrays are commonly designed so that stored data can always withstand a disk failure, but meeting this goal comes at a significant cost in performance. We show that this is unnecessary. By trading away a fraction of the enormous reliability provided by disk arrays, it is possible to achieve performance that is almost as good as that of a non-parity-protected set of disks. In particular, our AFRAID design eliminates the small-update penalty that plagues traditional RAID 5 disk arrays. It does this by applying the data update immediately, but delaying the parity update to the next quiet period between bursts of client activity. That is, AFRAID makes sure that the array is frequently redundant, even if it isn't always so. By regulating the parity update policy, AFRAID allows a smooth trade-off between performance and availability. Under real-life workloads, the AFRAID design can provide close to the full performance of an array of unprotected disks, and data availability comparable to that of a traditional RAID 5. Our results show that AFRAID offers 42% better performance for only 10% less availability, 97% better performance for 23% less availability, and as much as 4.1 times better performance for giving up less than half of RAID 5's availability. We explore here the detailed availability and performance implications of the AFRAID approach.
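
The deferred parity update described in the abstract can be illustrated with a small sketch. This is a toy in-memory model, not the paper's implementation: the class name `DeferredParityArray`, its methods, and the use of one integer per block are all assumptions made for illustration, and parity rotation across disks is omitted. A write touches only the data block and marks the stripe as unprotected; parity is recomputed later, when the caller signals an idle period.

```python
from functools import reduce
from operator import xor


class DeferredParityArray:
    """Toy model (hypothetical): data blocks are ints, one parity block per stripe."""

    def __init__(self, num_data_disks, num_stripes):
        self.data = [[0] * num_data_disks for _ in range(num_stripes)]
        self.parity = [0] * num_stripes
        self.unprotected = set()  # stripes whose parity is currently stale

    def write(self, stripe, disk, value):
        # Apply the data update immediately and skip the parity
        # read-modify-write that causes the small-update penalty.
        self.data[stripe][disk] = value
        self.unprotected.add(stripe)

    def on_idle(self):
        # Quiet period: recompute parity for every stale stripe.
        for stripe in sorted(self.unprotected):
            self.parity[stripe] = reduce(xor, self.data[stripe], 0)
        self.unprotected.clear()

    def reconstruct(self, stripe, failed_disk):
        # A lost block is recoverable only if the stripe's parity is current.
        if stripe in self.unprotected:
            raise RuntimeError("parity is stale; block cannot be reconstructed")
        survivors = (
            v for i, v in enumerate(self.data[stripe]) if i != failed_disk
        )
        return reduce(xor, survivors, self.parity[stripe])
```

In this sketch, the window between `write` and `on_idle` is the period in which a disk failure would lose data in the affected stripe. Calling `on_idle` more often shrinks that window at the cost of extra parity work, which is one simple way to picture the performance and availability trade-off the abstract describes.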
