Redundant disk arrays are an increasingly popular way to improve I/O system performance. Past research has studied how to stripe data in non-redundant (RAID Level 0) disk arrays, but none has yet examined how to stripe data in redundant disk arrays such as RAID Level 5, or how the choice of striping unit varies with the number of disks. Using synthetic workloads, we derive simple design rules for striping data in RAID Level 5 disk arrays given varying amounts of workload information. We then validate the synthetically derived design rules using real workload traces to show that the design rules apply well to real systems.

We find no difference in the optimal striping units for RAID Level 0 and 5 for read-intensive workloads. For write-intensive workloads, in contrast, the overhead of maintaining parity causes full-stripe writes (writes that span the entire error-correction group) to be more efficient than read-modify writes or reconstruct writes. This additional factor causes the optimal striping unit for RAID Level 5 to be four times smaller for write-intensive workloads than for read-intensive workloads.

We next investigate how the optimal striping unit varies with the number of disks in an array. We find that the optimal striping unit for reads in a RAID Level 5 array varies inversely with the number of disks, but that the optimal striping unit for writes varies directly with the number of disks. Overall, we find that the optimal striping unit for workloads with an unspecified mix of reads and writes is independent of the number of disks.

Together, these trends lead us to recommend (in the absence of specific workload information) that the striping unit over a wide range of RAID Level 5 disk array sizes be equal to 1/2 * average positioning time * disk transfer rate.
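As a rough illustration of the recommended rule, the sketch below evaluates 1/2 * average positioning time * disk transfer rate for a single disk. The function name `recommended_striping_unit` and the example disk parameters (16 ms average positioning time, 2 MB/s transfer rate) are assumptions chosen for illustration; they do not come from the paper.

```python
def recommended_striping_unit(avg_positioning_time_s: float,
                              transfer_rate_bytes_per_s: float) -> float:
    """Return the striping unit in bytes using the rule of thumb
    1/2 * average positioning time * disk transfer rate.

    Intended for the case where no workload information is available.
    """
    if avg_positioning_time_s < 0 or transfer_rate_bytes_per_s < 0:
        raise ValueError("positioning time and transfer rate must be non-negative")
    return 0.5 * avg_positioning_time_s * transfer_rate_bytes_per_s


# Hypothetical disk (illustrative values, not taken from the paper):
# 16 ms average positioning time, 2 MB/s sustained transfer rate.
unit_bytes = recommended_striping_unit(0.016, 2_000_000)
print(f"striping unit is roughly {unit_bytes / 1000:.0f} KB")
```

In practice the result would likely be rounded to a multiple of the sector or block size; that rounding step is omitted here because the abstract does not specify it.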