Development of a benchmark to measure system robustness: experiences and lessons learned

Performance benchmarks help answer the question 'which system is faster?' As computers are increasingly used in critical systems, more and more resources are being devoted to improving system quality. However, no benchmarks exist for comparing the dependability and robustness of systems, that is, for answering the question 'which system is more reliable?' The authors present an attempt to develop a benchmark that gauges a system's robustness, measured by its ability to tolerate errors. The initial effort produced four primitive benchmark programs, covering the file management system, memory access, user applications, and C library functions. Each primitive benchmark targets one system functionality and measures its behavior when given erroneous inputs. The authors present the motivation and experimental results for one of these primitive benchmarks in detail, followed by an analysis of the results. A methodology is also presented for combining the primitive benchmarks into an overall robustness figure.