The system recovery benchmark

We describe a benchmark for measuring system recovery on a nonclustered, standalone system. A system's ability to recover quickly from an outage is a critical factor in overall system availability. General-purpose computer systems, such as UNIX-based systems, tend to execute the same sequence of steps during system startup and outage recovery. Our experience has shown that these steps are consistent, reproducible, and measurable, and can thus be benchmarked. Additionally, the factors that create variability in restart and recovery can be bounded and represented in a meaningful way. A defined set of measurements, coupled with a specification for representing the results and system variables, provides the foundation for system recovery benchmarking.
