Stonehenge: a fault-tolerant real-time network-attached storage device

Stonehenge is a real-time network-attached storage device (NASD) that guarantees real-time data delivery to network clients even across single-disk failures. Stonehenge supports both best-effort and real-time disk read/write services, which are accessed through an object-based interface. Data access requests sent to Stonehenge can be serviced in either server-push or client-pull mode. Stonehenge's ability to guarantee real-time disk performance results from a cycle-based scan-order disk scheduling mechanism. Each disk I/O cycle in Stonehenge is either completely utilized or completely idle. This on-off disk scheduling model reduces the power consumption of the disk subsystem without increasing the buffer size requirement. Finally, Stonehenge exploits unused disk storage space to dynamically maintain additional redundancy beyond the RAID5-style parity. This extra redundancy, typically in the form of disk block replication, reduces the time needed to reconstruct the data on a failed disk. This paper describes the system architecture of Stonehenge and reports preliminary performance measurements collected from an initial Linux-based prototype implementation using Fast Ethernet and UltraSCSI disks.
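
As an illustration of the general idea behind cycle-based scan-order scheduling, the following minimal Python sketch plans a single I/O cycle. It is not the Stonehenge implementation. The names Request, plan_cycle and slots_per_cycle are hypothetical, and the sketch assumes that a cycle's capacity can be expressed as a fixed number of request slots and that real-time requests are admitted ahead of best-effort ones. A real scheduler would more likely budget seek and transfer time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    stream_id: int
    block: int       # logical block address on the disk
    real_time: bool  # True for guaranteed streams, False for best-effort


def plan_cycle(pending, slots_per_cycle):
    """Choose the requests for one I/O cycle and return them in scan order.

    Assumption: capacity is a simple slot count, not a time budget.
    """
    # Admit real-time requests first; best-effort requests fill leftover slots.
    real_time = [r for r in pending if r.real_time]
    best_effort = [r for r in pending if not r.real_time]
    admitted = (real_time + best_effort)[:slots_per_cycle]
    # Scan order: a single sweep across the disk in increasing block address.
    return sorted(admitted, key=lambda r: r.block)


pending = [
    Request(1, 900, True),
    Request(2, 100, False),
    Request(3, 400, True),
    Request(4, 50, False),
]
order = plan_cycle(pending, slots_per_cycle=3)
print([r.block for r in order])  # [100, 400, 900]
```

With three slots, both real-time requests and the first best-effort request are admitted and then served in one sweep. The fourth request waits for a later cycle.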